This class focuses on textual analysis using XML and related technologies. Some courses (such as the Digital Atlas Design Internship) are dedicated to geospatial methods, such as creating illustrative, interactive digital maps. Digital mappers often start from preexisting tabular data that associates some piece of information with a geographical coordinate.
However, textual and geospatial methods are not necessarily as separate as they might seem at first glance. The XML hierarchy of a text provides structure that can be transformed into geographical data. For instance:
If we want to extract geographic information from our XML text, we need three ingredients:
As a first step, we need to understand our output format, which brings us to GeoJSON. GeoJSON is a format for encoding a variety of geographic data structures. It is useful here because it is (relatively) human-readable and can associate descriptive and quantitative data with points, lines, and polygons. Read this tutorial, which explains the basic format and syntax of GeoJSON. Like all coding languages, the format is finicky: a single misplaced comma, bracket, or quotation mark can result in an unreadable output file, so pay careful attention.
The linked tutorial clearly explains how to format different kinds of shapes using GeoJSON. One additional point to consider is the properties field associated with a GeoJSON feature:
{
  "type": "Feature",
  "geometry": {
    "type": "Point",
    "coordinates": [-79.91877256652567,40.441063867901882]
  },
  "properties": {
    "name": "Commonplace Coffee",
    "type": "coffee_shop",
    "occupancy": "15"
  }
}
You can create any number of properties to associate with a given object (geographic point or set of points) and assign them any values you wish. For instance, this is where you might assign a numerical value, such as the number of mentions of a given place, which you could use to make a heat map. This would be the equivalent of an additional column of information associated with the coordinates if your output were in tabular format (e.g., CSV).
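As a rough illustration of how such a numerical property could be produced from an XML text, the sketch below groups place references by ID and counts them. It assumes places are tagged as `<location uid="…">` elements and that the IDs contain no quotation marks; the property name `mentions` is purely illustrative. It prints one `properties` object per place rather than a complete GeoJSON file.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Assumption: every place reference is a <location> with a uid attribute -->
    <xsl:for-each-group select="//location[@uid]" group-by="@uid">
      <xsl:text>"properties": {"name": "</xsl:text>
      <!-- The shared uid of the current group of mentions -->
      <xsl:value-of select="current-grouping-key()"/>
      <xsl:text>", "mentions": </xsl:text>
      <!-- How many <location> elements share that uid -->
      <xsl:value-of select="count(current-group())"/>
      <xsl:text>}&#10;</xsl:text>
    </xsl:for-each-group>
  </xsl:template>
</xsl:stylesheet>
```

Note that the count is written without quotation marks, so a mapping tool reads it as a number rather than as a string.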
In order to map something, it is not enough to know that we are dealing with a string that refers to a location: we also need to know the latitude and longitude. In principle, we could simply encode the coordinates as attributes (e.g., <location uid="pittsburgh" latitude="40" longitude="-79">Steel City</location>), but that would be cumbersome and repetitive. Instead, we are going to learn about the XSLT map function. (Warning: this function is called "map" because it maps one value to another value, not because it has anything specifically to do with the topic of this assignment!)
An XSLT map stores key-value pairs. Once it is defined, you can call upon the map to swap in the associated value every time a key appears. For instance, in our code for this assignment:
<xsl:variable name="coordinates" as="map(xs:string, xs:string)">
  <xsl:map>
    <xsl:map-entry key="'commonplace'" select="' -79.91877256652567,40.441063867901882'"/>
    <xsl:map-entry key="'ascend'" select="' -79.8961075427762,40.45097067558358'"/>
    <xsl:map-entry key="'posvar'" select="' -79.95426787957021,40.44125235992703'"/>
  </xsl:map>
</xsl:variable>
This is how we can get from a unique ID (e.g., <location uid="pittsburgh">Pittsburgh</location>) to the coordinate values without having to encode the coordinates every time the place is mentioned in our text.
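The lookup itself can be sketched as follows. This is a minimal example, not the assignment's stylesheet: it assumes an XSLT 3.0 processor and an input whose places are tagged as `<location uid="…">`, and it reuses two of the map entries shown above. Because a map can be called like a function, `$coordinates(@uid)` returns the value stored under the current element's ID.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:output method="text"/>

  <!-- Lookup table from location IDs to "longitude,latitude" strings -->
  <xsl:variable name="coordinates" as="map(xs:string, xs:string)">
    <xsl:map>
      <xsl:map-entry key="'commonplace'" select="'-79.91877256652567,40.441063867901882'"/>
      <xsl:map-entry key="'posvar'" select="'-79.95426787957021,40.44125235992703'"/>
    </xsl:map>
  </xsl:variable>

  <xsl:template match="/">
    <!-- Only elements that carry a uid can be looked up -->
    <xsl:for-each select="//location[@uid]">
      <xsl:value-of select="@uid"/>
      <xsl:text>: </xsl:text>
      <!-- Call the map with the uid as its key; an unknown uid yields nothing -->
      <xsl:value-of select="$coordinates(@uid)"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Each output line pairs an ID with its coordinate string, which is a quick way to confirm that every ID used in the text has an entry in the map before generating any GeoJSON.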
Now let's experiment with moving from XML input information to XSLT transformation to a GeoJSON output:
Now that you have some dots, see if you can write new XSLT code to connect the points on your journey "as the crow flies."
The following example illustrates the format for a line in GeoJSON. Think about which parts of this would need to be dynamically produced by an XSLT transformation.
{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "geometry": {
        "type": "LineString",
        "coordinates": [
          [-79.91877256652567, 40.441063867901882],
          [-79.91762256652567, 40.440063867901882],
          [-79.91662256652567, 40.439063867901882]
        ]
      },
      "properties": {
        "name": "line_1"
      }
    },
    {
      "type": "Feature",
      "geometry": {
        "type": "LineString",
        "coordinates": [
          [-79.91562256652567, 40.438063867901882],
          [-79.91462256652567, 40.437063867901882],
          [-79.91362256652567, 40.436063867901882]
        ]
      },
      "properties": {
        "name": "line_2"
      }
    }
  ]
}
Your XML file is organized into days, and each day mentions any number of locations. Each day's line needs a unique name (one possibility is to use the position() function to determine which day you are on).
Then for each <location/> element you will need to (a) pull in the unique location ID, and then (b) use your map function to swap it for the coordinates.
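One possible shape for such a transformation is sketched below; treat it as a starting point rather than the expected solution. It assumes each day is a `<day>` element (a hypothetical name, so substitute whatever your XML uses), that places are `<location uid="…">` elements, and that the map holds "longitude,latitude" strings as in the assignment code. Since a GeoJSON LineString needs at least two positions, days with fewer than two known locations are skipped.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:map="http://www.w3.org/2005/xpath-functions/map"
    exclude-result-prefixes="xs map">

  <xsl:output method="text"/>

  <!-- Lookup table from location IDs to "longitude,latitude" strings -->
  <xsl:variable name="coordinates" as="map(xs:string, xs:string)">
    <xsl:map>
      <xsl:map-entry key="'commonplace'" select="'-79.91877256652567,40.441063867901882'"/>
      <xsl:map-entry key="'ascend'" select="'-79.8961075427762,40.45097067558358'"/>
      <xsl:map-entry key="'posvar'" select="'-79.95426787957021,40.44125235992703'"/>
    </xsl:map>
  </xsl:variable>

  <xsl:template match="/">
    <xsl:text>{"type": "FeatureCollection", "features": [&#10;</xsl:text>
    <!-- Assumption: <day> is the element name; keep only days with 2+ known locations -->
    <xsl:for-each select="//day[count(.//location[@uid = map:keys($coordinates)]) ge 2]">
      <xsl:text>{"type": "Feature", "geometry": {"type": "LineString", "coordinates": [</xsl:text>
      <!-- One [longitude,latitude] pair per location whose uid is a key in the map -->
      <xsl:for-each select=".//location[@uid = map:keys($coordinates)]">
        <xsl:text>[</xsl:text>
        <xsl:value-of select="$coordinates(@uid)"/>
        <xsl:text>]</xsl:text>
        <!-- Comma between pairs, but not after the last one -->
        <xsl:if test="position() != last()">, </xsl:if>
      </xsl:for-each>
      <xsl:text>]}, "properties": {"name": "line_</xsl:text>
      <!-- Position among the days that were kept, used as the line's unique name -->
      <xsl:value-of select="position()"/>
      <xsl:text>"}}</xsl:text>
      <!-- Comma between features, but not after the last one -->
      <xsl:if test="position() != last()">,&#10;</xsl:if>
    </xsl:for-each>
    <xsl:text>&#10;]}</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

The two `position() != last()` tests are what keep the commas legal: GeoJSON does not allow a trailing comma after the final coordinate pair or the final feature.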
This is an in-class assignment. So long as you have updated at least some part of the XML and XSLT, submit whatever you have completed (even if you were unable to solve Part V).