You want the third song on the left.

Researchers working with large collections of music really really like to take those collections, stare at them cross-eyed and then push them onto a two-dimensional map of some sort.  If you don’t believe me, go here for 30 seconds and then we can continue.

A lot of the resulting visuals are beautiful and can be informative. Some look like 1994, but we’ll just assert that’s an intentional retro design aesthetic that was carefully perfected.

The thing is pretty pictures can only serve so much purpose. You’re still dealing with music and music needs to be heard. I’m interested in how interactive audio, primarily but not exclusively spatial audio, can be used to enhance or even replace some of these elaborate visuals.

One tool I’m currently messing around with is spatial databases for fast querying of spatial data.  I’ve just finished setting up a postgresql database  that is spatially-enabled (everyone loves enablers) with postgis.

I have a collection of about 50,000 tracks that have been analyzed by SoundBite. As a part of his MSc research at QM, Steve Lloyd took the feature vectors computed by SoundBite and arranged the 50,000 tracks on a 2D map. He used Landmark Multidimensional Scaling (LMDS) with 1000 randomly selected landmark points. He also fixed the maximum track-to-track dissimilarity to a ceiling to prevent outliers from ending up too far from the main group of tracks. So in summary, I have a collection of 50,000 songs, all the associated metadata, the audio content, and (x,y) coordinates placing each song on map.

So, a short tutorial on how to create a spatially-enabled database:

  1. Install a database and the required dependencies.  I’m using postgresql with postgis which requires Proj4 and  GEOS.  But there are other options.
  2. Create the database and set it up for spatial queries. At the command line:
    createdb [yourdatabase]
    createlang plpgsql [yourdatabase]
    psql -d [yourdatabase] -f postgis.sql
    psql -d [yourdatabase] -f spatial_ref_sys.sql

    If you used macports, you will find the sql files in /opt/local/share/postgis.

  3. Create table to hold any metadata.  In psql:
    CREATE TABLE mytable(id integer, track_name text);
    Note the semicolon, the statement won’t execute without it.
  4. Add a spatially-enabled column to hold the coordinate information.
    SELECT AddGeometryColumn('', 'mytable', 'mygeocolumn', -1, 'POINT', 2);
    The arguments are:

    • The schema name, we don’t use one here and could have used only 5 arguments.
    • The table name.
    • The new column’s name.
    • The SRID, a reference to mapping system to use, -1 means we aren’t using a pre-defined system.
    • The type of geometry.
    • The number of dimensions.
  5. Now to start inserting data into the table. This can be done row-by-row by through the insert command:
    INSERT INTO mytable(id, name, mygeocolumn)
    VALUES (1, 'First Song', ST_GeomFromText('POINT(0 5)', -1);

    ST_GeomFromText will create the proper datatype. Note that there is only a space between the two coordinate values.

  6. It’s likely that you don’t want to type out every individual track.  In my case that’d be 50,000 entries to process by hand.  Instead use the copy functionality by creating a text file describing the tracks and then copy that file into the table.  This has the benefit of letting postgres check things like primary keys so that the table isn’t half-built when an error is encountered leaving you with half a table.  Instead it’s all or nothing.  It’s fairly simple to create a text file according to whatever delimiter you’d like; it’s all well documented.  However, I had a problem trying to import geometry data.  This is the fix I found.  I make no guarantees that it won’t break something, but it has worked out for me so far.  In the text file, define the Points as:
    SRID=-1;POINT(6 10)
    Then to copy the text file into a table using a tab as the delimiter:
    \COPY mytable FROM '/the/path/to/myfile.txt' WITH DELIMITER E'\t'
  7. To take advantage of the postgis, you need to compute the spatial indices.
    CREATE INDEX myindexname ON mytable USING GIST(mygeocolumn);
    VACUUM ANALYZE mytable(mygeocolumn);
  8. Query away.  See the PostGis docs for tips on efficient querying, but here’s an example of finding all tracks that are within 10 units from (3,4).
    SELECT ST_AsText(mygeocolumn) AS location FROM mytable
    WHERE ST_DWithin(mygeocolumn, 'POINT(3 4)', 10.);