Google Docs, the XML API, and the nightmare of importXML

I loathe importXML. Truly despise it.  Sure, it works for some things, but the moment that you want to get more than a single piece of XML data back from each call, it becomes a problem.

Sure, you can have an importxml call which has multiple attributes in the xpath entry, but that depends on the site never changing the order that comes back. Which isn’t the world’s best idea.

There is, however, a solution. It’s one that I’ve been looking at recently, when Mynnna linked a forum post to me about someone loading asset information into google docs. It’s a neat solution, especially because the asset api is a more complicated one, with nesting (Assets can exist within Assets).

That got me looking at other APIs, and led to the creation of a new github repository where I’m keeping the code of some new custom functions for loading data. I’m not using them myself, but that’s because I’m a bit of a database snob. I can do things there which would be an absolute PITA to do in a spreadsheet. However, I am a code junkie, so this lets me scratch that itch (I can do this better! Tada!)

 

I thought I’d explain a little of what goes into these functions, basing it off the simplest of them. The outpost loader function.

function loadOutposts(){
  var outposts= new Array();
  var url = "https://api.eveonline.com/eve/ConquerableStationList.xml.aspx";
  var parameters = {method : "get", payload : ""};
  var xmlFeed = UrlFetchApp.fetch(url, parameters).getContentText();
  var xml = XmlService.parse(xmlFeed);
  if(xml) {
    var rows=xml.getRootElement().getChild("result").getChild("rowset").getChildren("row");
    for(var i = 0; i < rows.length; i++) {
      outpost=[rows[i].getAttribute("stationID").getValue(),
                 rows[i].getAttribute("stationName").getValue(),
                 rows[i].getAttribute("stationTypeID").getValue(),
                 rows[i].getAttribute("solarSystemID").getValue(),
                 rows[i].getAttribute("corporationID").getValue(),
                 rows[i].getAttribute("corporationName").getValue()
                 ]
      outposts.push(outpost);
    }
  }
  return outposts;
}

If you want to have a function that returns a bunch of rows and columns, you have to build an array of data, and return that.

An array is a simple idea. It’s way of bundling up information. A single dimensional array is a list. A two-dimensional array is a table (like a spreadsheet). You can have more levels, which become somewhat harder to visualise (3 dimensional: a table where a cell can contain lists of data. and so on)

So you have to take the XML data, and format it into a 2 dimensional array. Thankfully, this works easily. Each row in the API, becomes a row in the array. You build a list, and add it to the bottom at the total array (known as a push)

Step one: get the XML. that’s what the UrlFetchApp.fetch line does. At this point, you just have a bunch of text, which happens to be XML.
Step two: turn the text into a data structure your code can deal with. That’s the XmlService.parse line. You now have a data structure that your code can access, and loop through.
Step three: Make sure it was actually XML you had. If the parser couldn’t handle it, it’d have made the xml variable false, so the if (xml) test would fail.
Step four: loop through it. We don’t care about anything but the rows, so we burrow down to that level, (the getChild and getChildren line). With that split off, we now have an array (a 1 dimensional list) of rows. The for loop iterates through that, with a counter that points to each entry. The address of the entry/
Step five: Create a list of data about each outpost. Because we’re working with a proper XML engine here, it’s case sensitive on attribute names, unlike importXML.
Step six: push the list of data about a specific outpost onto the bottom of the list of outposts
Step seven: Repeat steps five and six until it’s finished.
Step eight: Hand the array back, with a return statement.

That’s pretty much the core of all the functions. Some are a little more complex, because they’re turning typeids into typenames and so on. That’s done by generating an array where the address is the typeid, and the other entry is the name. You can see an example of that in the MarketOrders function, for stations and typeids. If you’re wondering why I’m adding a _ to it, it’s to make sure it’s treated as text, so no funny business with number comparison or arrays with holes happens.

For the Market Orders function, you’ll also notice it requires the moment.js library loaded. This is because the scripting engine behind google sheets handles dates very badly. moment.js makes it far easier to do something like add 90 days to a date.

Anyway, the required steps to get one of the scripts working (excluding anything like the moment.js library, and additional sheets, mentioned in the comments for specific scripts)

 

  • open up a new, or existing sheet.
  • Tools menu -> Script editor
  • If you haven’t done this before, for this sheet, it’ll ask you what you want to create a script for. pick ‘Blank Project’
  • Delete the text which it shows ( the function myFunction() { } stuff. )
  • paste in the text for the function of mine you want to add.
  • As long as there aren’t other specific steps, you’re now done. The Eve Central and outpost scripts don’t need anything else.

At this point, you can just use the functions in your regular sheets 🙂

 

If there are any APIs you can’t get working, give me a yell, and I may be able to help.

Google Docs, the XML API, and the nightmare of importXML by Fuzzwork Enterprises is licensed under a Creative Commons Attribution 4.0 International License.

  • Lol.

    Spreadsheets – the least efficient way to process data…

  • atcq

    o/

    i was going crazy using importXML in my spreadsheet for reactions, it worked for a few hours/minutes and then showed an internal error and stopped working.
    with this it’s so easy to pull the data and it showed me how easy it is to script google sheets!

    thx from here 🙂

    • Zad Murrard

      When switching from old gdocs to new gdocs you can also for starters replace the importXML with your own. Here’s the one I’m using. Note that it supports only small part of xpath and thus everything will not work, also might have errors etc.

      function importXml2(targetUrl, xpathQuery)

      {

      // Edit these for debugging

      //targetUrl = “dddu”

      //xpathQuery = “”

      var xmlResponse = UrlFetchApp.fetch(targetUrl).getContentText()

      var document = XmlService.parse(xmlResponse)

      var querys = getQuerys(xpathQuery)

      if (typeof querys == ‘string’) // one query

      {

      return parseXmlPart(document.getRootElement(), querys)

      }

      else

      {

      var answer = new Array();

      var len = querys.length

      for (var i = 0; i < len; ++i)

      {

      answer = answer.concat(parseXmlPart(document.getRootElement(), querys[i]))

      }

      return answer

      }

      function getQuerys(xpathPart)

      {

      var querySplit = xpathPart.indexOf(" | ")

      if (querySplit < 2)

      {

      return xpathPart

      }

      var querys = new Array()

      querys.push(xpathPart.substr(0, querySplit))

      querys = querys.concat(getQuerys(xpathPart.substr(querySplit)))

      return querys

      }

      function parseXmlPart(xmlToParse, xpathPart)

      {

      if (xpathPart[0] != '/')

      {

      return ""

      }

      if (xpathPart[1] == '@')

      {

      var attrValue = xmlToParse.getAttribute(xpathPart.substr(2)).getValue()

      var number = parseFloat(attrValue)

      if (number != NaN)

      {

      return number

      }

      else

      {

      return attrValue

      }

      }

      if (xpathPart[1] == '/')

      {

      if (xpathPart[2] != '@')

      {

      var subPart = xpathPart.substr(2)

      var nextSplit = subPart.indexOf('/')

      var childToLook = subPart.substr(0, nextSplit)

      var remainingPath = subPart.substr(nextSplit)

      var matchingChildren = getMatchingChildren(xmlToParse, childToLook)

      var response = new Array()

      var len = matchingChildren.length

      for (var i = 0; i 0)

      {

      var childToLook = subPart.substr(0, nextSplit)

      var remainingPath = subPart.substr(nextSplit)

      if (xmlToParse.getName() != childToLook)

      {

      var response = new Array()

      var children = xmlToParse.getChildren(childToLook)

      var len = children.length

      for (var i = 0; i < len; ++i)

      {

      response = response.concat(parseXmlPart(children[i], remainingPath))

      }

      return response

      }

      else

      {

      return parseXmlPart(xmlToParse, remainingPath)

      }

      }

      else

      {

      var response = new Array()

      var children = xmlToParse.getChildren(xpathPart.substr(1))

      var len = children.length

      for (var i = 0; i < len; ++i)

      {

      response.push(children[i].getValue())

      }

      return response

      }

      }

      return ""

      }

      function getAttributesFromChildren(xmlPart, attributeName)

      {

      var attrValues = new Array()

      var value = xmlPart.getAttribute(attributeName)

      if (value != null)

      {

      attrValues.push(value)

      }

      var children = xmlToParse.getChildren()

      var len = children.length

      for (var i = 0; i < len; ++i)

      {

      attrValues = attrValues.concat(getAttributesFromChildren(children[i], attributeName))

      }

      return attrValues

      }

      function getMatchingChildren(xmlPart, childName)

      {

      var matchingChildren = new Array()

      var children = xmlPart.getChildren()

      var len = children.length

      for (var i = 0; i < len; ++i)

      {

      if (children[i].getName() == childName)

      {

      matchingChildren.push(children[i])

      }

      matchingChildren = matchingChildren.concat(getMatchingChildren(children[i], childName))

      }

      return matchingChildren

      }

      }

  • Case

    Edit – Thank you. Importxml is a horrible terrible thing. This function rocks, like I’m gonna get you shitfaced at Fanfest 2015 rocks.

    Would it be possible to get a volume pull from eve-marketdata? I use this, average the 30 days, and then it gives me a rough daily average. http://api.eve-marketdata.com/api/item_history2.xml?char_name=demo&region_ids=10000064&type_ids=2905

    I tried to make my own function, but it’s not going well…

    EVE-Central’s volume has been borked lately, showing volumes in the thousands when only a few are traded daily.

    • Fuzzysteve

      (sorry for the delay. no notification 🙁 )

      Looking at the eve central one, to get it working with the evemarketdata api:
      update the url
      var rows=xml.getRootElement().getChild(“marketstat”).getChildren(“type”);
      should become
      var rows=xml.getRootElement().getChild(“result”).getChild(“rowset”).getChildren(“row”);

      replace all the rows[i] bits with:
      parseInt(rows[i].getAttribute(“typeID”).getValue()),
      parseInt(rows[i].getAttribute(“regionID”).getValue()),
      rows[i].getAttribute(“date”).getValue(),
      parseFloat(rows[i].getAttribute(“lowPrice”).getValue()),
      parseFloat(rows[i].getAttribute(“highPrice”).getValue()),
      parseFloat(rows[i].getAttribute(“avgPrice”).getValue()),
      parseInt(rows[i].getAttribute(“volume”).getValue()),
      parseInt(rows[i].getAttribute(“orders”).getValue())

      and that /should/ be you.

      • Case

        Definitely gonna buy you a beer in Iceland.

        Thanks!

        • Fuzzysteve

          If you have any trouble, drop me an evemail, or email 🙂

  • Joel

    Why would I be getting the error “cannot find function forEach”?

  • Pingback: Tools of the Trade: Spreadsheets | The Neocom()