Improving Artist\Album\song title pattern matching
Moderator: Gurus
This is just an idea for improving the quality of the data in the database. Obviously, the CDDB database is very useful and is a popular way of populating MP3 tags. One problem with it is that the data is inconsistent. For example, one user may enter an artist name in CDDB as R.E.M and another may enter the same artist name as R E M. The problem also occurs in song titles and album names, and it makes the database messy once the data is imported into Songs-DB. Garbage in, garbage out, as the saying goes. Spelling checkers are not good at resolving problems in this kind of data because much of the text is not made up of dictionary words. In the past, I've had some success in resolving problems in data like this by using the SoundEx algorithm, which gives the same or similar codes for text that sounds the same. See http://physics.nist.gov/cuu/Reference/soundex.html. Obviously, some kind of manual intervention would be required to select which of the matching records is the correct one before changing a file's properties. I haven't seen this functionality in any other music manager, but what do you think?
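To make the idea concrete, here is a rough sketch in Python of how SoundEx codes could be used to flag likely duplicates for manual review. This is only an illustration of the common American SoundEx rules, not anything from Songs-DB itself; the function names `soundex` and `group_by_soundex` are made up for the example, and treating a whole artist name as one word (ignoring spaces and punctuation) is my own assumption.

```python
from collections import defaultdict

# Letter-to-digit table for American SoundEx.
_CODES = {}
for _digit, _letters in {"1": "BFPV", "2": "CGJKQSXZ", "3": "DT",
                         "4": "L", "5": "MN", "6": "R"}.items():
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(text):
    """Return a 4-character SoundEx code, or "" if text has no A-Z letters."""
    # Assumption: spaces, dots and other non-letters are simply ignored.
    letters = [c for c in text.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    result = letters[0]
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with the same code
        code = _CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # vowels reset prev, so a repeated code after a vowel counts
    return (result + "000")[:4]  # pad with zeros, truncate to 4 characters


def group_by_soundex(names):
    """Group names by code and keep only groups with more than one spelling."""
    groups = defaultdict(list)
    for name in names:
        groups[soundex(name)].append(name)
    return {code: members for code, members in groups.items()
            if len(set(members)) > 1}


candidates = group_by_soundex(["R.E.M", "R E M", "Beatles", "Beetles", "Nirvana"])
for code, members in candidates.items():
    print(code, members)
```

Each group that comes back is only a list of candidates. SoundEx will also lump together names that are genuinely different, which is why the user would still have to pick the correct spelling by hand.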
I think the idea is good, and something like that could be added. Not now, because there are some higher-priority tasks, but later.
Possibly even better would be for Songs-DB to have automatic song recognition based on the audio data. Then, with the help of a huge database of (almost) all songs, it would be easier to handle users' collections. There are currently some attempts to build such a database, and maybe some of them will be implemented in Songs-DB in the future.
BTW, the link you supplied is wrong, but fortunately there is enough information about the SoundEx algorithm on the web.