Getting cElementTree working in Ren'Py [SOLVED using json]

Posted: Sat Feb 28, 2015 1:56 pm
by CSPlusC
Hi all! While ElementTree works out of the box in the game, I'm having issues getting cElementTree working and am hoping someone can point me in the right direction. I suspect part of the problem is that cElementTree requires the C libraries to be bundled, and even placing them inside the Ren'Py lib directories is only a partial solution.

I'm using Articy XML exports as the main data source for the game (Articy is much faster for doing non-linear dialogue than hand-writing scripts), but unfortunately loading times are creeping up steadily. I've gone through a couple rounds of optimizing, and at this point the profiler reports that out of 28 seconds of startup, my own code takes 3 seconds while ElementTree takes 25 to parse the XML, so getting cElementTree working is the clear next step.

Anyway, here's what I've tried:
1) I started with the basic "import cElementTree as ET", which failed due to not finding the module. I Googled "renpy celementtree" and found a match on GitHub but nothing useful, so I tried some other changes...

2) I've had success getting standard Python modules working in Ren'Py by taking their .py files from python27/lib and copying them to my game source code directory, and tried copying over cElementTree.py. That got past the first error but then gave a new error of "Unable to import _elementtree".

3) On research, that error means that the C library files are missing. I tried copying over _elementtree.lib and _elementtree.pyd from Python27/libs to my game directory, and the import error remained. I then copied them to renpy-6.18.3-sdk\lib\pythonlib2.7 and was able to start the program.

4) The program would start...but then immediately threw the exception below:

Code:

I'm sorry, but an uncaught exception occurred.

While executing init code:
  File "game/script.rpy", line 31, in script
    init python:
  File "game/script.rpy", line 104, in <module>
    articy = Articy()
  File "game/articy.rpy", line 60, in __init__
    tree = ET.parse(xml_file)
RuntimeError: cannot load dispatch table from pyexpat
5) On research this pyexpat error is most often resolved on Linux platforms by installing additional libraries. At this point, I'm concerned I'm going beyond what I should be doing for something that needs to be distributed with Ren'Py and am asking for help before potentially digging myself a hole.

I welcome suggestions! I figure there's a knack to including C libraries with Ren'Py, and am happy to test different approaches.

Re: Getting cElementTree working in Ren'Py

Posted: Sat Feb 28, 2015 9:38 pm
by SusanTheCat
Are you forced to use cElementTree?

The reason I ask is I have successfully used

Code:

import xml.etree.ElementTree as ET
tree = ET.parse(renpy.loader.transfn(rv))
Susan

Re: Getting cElementTree working in Ren'Py

Posted: Sun Mar 01, 2015 12:26 am
by PyTom
I'd suggest either using straight-up elementtree, or pre-converting the data into json or something else more usable. Pre-conversion will likely make the data load faster, and means you don't have to ship dependencies for the many platforms Ren'Py runs on. (Getting something to build on six platforms - win32, osx64, linux32, linux64, android, and ios - is kind of a challenge.)
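
A minimal sketch of the pre-conversion idea, assuming only the standard library. The helper names element_to_dict and xml_to_json and the sample XML are made up for illustration; a real Articy export has its own, much larger schema, and this sketch ignores text that follows a child element.

```python
import json
import xml.etree.ElementTree as ET


def element_to_dict(element):
    """Recursively convert an Element into plain dicts and lists."""
    node = {"tag": element.tag, "attrib": dict(element.attrib)}
    text = (element.text or "").strip()
    if text:
        node["text"] = text
    children = [element_to_dict(child) for child in element]
    if children:
        node["children"] = children
    return node


def xml_to_json(xml_text):
    """Parse an XML string and return an equivalent JSON string."""
    root = ET.fromstring(xml_text)
    return json.dumps(element_to_dict(root))


# Hypothetical sample, not the real Articy format.
sample = '<Dialogue Id="d1"><Line Speaker="Alex">Hello</Line></Dialogue>'
converted = json.loads(xml_to_json(sample))
```

Run once at export time, something like this would leave the game with only a JSON file to load at startup, so no XML parser has to run on the player's machine.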

Re: Getting cElementTree working in Ren'Py

Posted: Sun Mar 01, 2015 10:31 am
by CSPlusC
SusanTheCat wrote: Are you forced to use cElementTree?

The reason I ask is I have successfully used

Code:

import xml.etree.ElementTree as ET
tree = ET.parse(renpy.loader.transfn(rv))
Susan
Yes, ElementTree works easily out of the box and I have been using it as a starting point. My issue is optimization: adding dialogue is steadily slowing down startup, and it's especially noticeable on a teammate's older laptop, so I'm concerned the final loading times will be too great. cElementTree should be much faster, though from what PyTom is pointing out that sounds like it'd be a dangerous route to go. Ah well, thanks for weighing in!
PyTom wrote: I'd suggest either using straight-up elementtree, or pre-converting the data into json or something else more usable. Pre-conversion will likely make the data load faster, and means you don't have to ship dependencies for the many platforms Ren'Py runs on. (Getting something to build on six platforms - win32, osx64, linux32, linux64, android, and ios - is kind of a challenge.)
Good point on the dependencies...sounds like a support nightmare for different platforms. I'll stick to basic ElementTree for internal testing.

I've had success using the dill module to speed things up: whenever I change the XML data the program will generate a dill file of its memory state and then load that on subsequent starts. Doing that, I'm averaging 8.5 seconds start time on ElementTree and < 3 seconds on dill, which is good enough for development.
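
A rough sketch of that kind of cache, assuming the standard pickle module stands in for dill (dill handles more object types, but the load/dump pattern is the same). The function load_with_cache, the file names, and slow_build are hypothetical placeholders for the real parsing step.

```python
import os
import pickle
import tempfile


def load_with_cache(source_path, cache_path, build):
    """Return build(source_path), reusing a pickled copy while it is fresh."""
    if (os.path.exists(cache_path)
            and os.path.getmtime(cache_path) >= os.path.getmtime(source_path)):
        with open(cache_path, "rb") as cache_file:
            return pickle.load(cache_file)
    # Cache is missing or older than the source: rebuild and save it
    data = build(source_path)
    with open(cache_path, "wb") as cache_file:
        pickle.dump(data, cache_file, protocol=pickle.HIGHEST_PROTOCOL)
    return data


calls = []


def slow_build(path):
    """Placeholder for the expensive XML parse; records each call."""
    calls.append(path)
    with open(path) as source_file:
        return {"length": len(source_file.read())}


work_dir = tempfile.mkdtemp()
source = os.path.join(work_dir, "export.xml")
cache = os.path.join(work_dir, "export.pickle")
with open(source, "w") as source_file:
    source_file.write("<Export/>")

first = load_with_cache(source, cache, slow_build)   # builds and writes the cache
second = load_with_cache(source, cache, slow_build)  # served from the cache
```

Only unpickle files your own build produced, since loading a pickle can execute arbitrary code.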

When optimizing a version for distribution I will try your suggestion of pre-converting the data...I saw some benchmark data showing that the pure Python json module destroys ElementTree, so that sounds worth doing. It might be faster than going the dill/pickle route too.

Thanks,
Alex

Solved using this XML to JSON script

Posted: Sat Jun 27, 2015 12:39 pm
by CSPlusC
I finally solved this issue and figured I'd offer the fix for anyone else who is dealing with it. PyTom is right: it's far better to use JSON than mess around with getting XML optimized. Articy still only exports to XML, so what I ended up doing is setting up a test where at program load it checks whether a) there is a JSON file present and b) the JSON file is newer than the XML one. If either test is false, then it assumes you just exported new XML data and it will convert it.

To make this as fast as possible, I added these two modules to the Ren'Py game directory:
1) ujson (https://pypi.python.org/pypi/ujson), which is way faster than the default json module. Making it work in Ren'Py was a little tricky: I ended up putting ujson.pyd directly inside my game directory, then placed ujson.pyd inside my Ren'Py source folder at \renpy-6.18.3-sdk\lib\pythonlib2.7.

2) xmltodict (https://github.com/martinblech/xmltodict), which is also much faster than a native Python module for converting XML to a dictionary. Once you have a massive XML dictionary, you can just pass it to the json module and then save your data. It was easy to use this in Ren'Py: I just dropped the source code folder directly inside my game directory.

Once you have those two in place, this code will do the conversion:
import os
from datetime import datetime

# Use xmltodict as the converter for Articy XML to JSON data
import xmltodict as XD

# Path on dev system to get ujson working: D:\renpy-6.18.3-sdk\lib\pythonlib2.7. See https://pypi.python.org/pypi/ujson/ for more details
try:
    import ujson as json_module
except ImportError:
    # If ujson is not installed, fall back to the standard json module
    import json as json_module


def main():
    # Main data source: Articy exported XML. Load all content
    # Optimization: JSON is vastly faster than XML, so unless there is new data load from JSON. If there is new XML data, load it and convert to JSON for future speed-ups
    json_filename = 'test_file.json'
    xml_filename = 'test_file.xml'
    # Ren'Py note: use the following statements to run this from the game directory:
    # json_file_path = config.gamedir + '/test_file.json'
    # xml_file_path = config.gamedir + '/test_file.xml'
    # Non-Ren'Py version that can be run from wherever:
    json_file_path = './%s' % json_filename
    xml_file_path = './%s' % xml_filename

    # Toggle loading fresh XML vs JSON. Test to see if we need to create a new JSON file
    create_json_file = False
    if os.path.exists(json_file_path):
        # Compare its timestamp to the Articy XML file's one
        json_timestamp = os.path.getmtime(json_file_path)
        xml_timestamp = os.path.getmtime(xml_file_path)

        # Is the JSON file older than the XML? If so, it is stale and has to be regenerated
        if json_timestamp < xml_timestamp:
            # Flag creating JSON file
            create_json_file = True
    else:
        # If there is no JSON file, we need to create it
        create_json_file = True

    # Was the flag set to create a JSON file? If so load the XML file and parse it
    if create_json_file:
        articy_start_time = datetime.now()
        print("Starting xmltodict at %s" % articy_start_time)
        # Open in binary mode so the underlying expat parser gets raw bytes
        with open(xml_file_path, 'rb') as xml_file:
            tree = XD.parse(xml_file)
        print("Done xmltodict with load time of %0.2f seconds" % (datetime.now() - articy_start_time).total_seconds())

        # Save it to JSON to skip this step next time
        # Ren'Py note: inside the game engine, build json_file_path from config.gamedir as shown above
        articy_start_time = datetime.now()
        print("\nStarting JSON dump at %s" % articy_start_time)
        # The with block closes the file, so no explicit close() is needed
        with open(json_file_path, 'w') as json_file:
            json_module.dump(tree, json_file)
        print("Done JSON dump with save time of %0.2f seconds" % (datetime.now() - articy_start_time).total_seconds())

    # Load data from JSON. This is called whether we created a new JSON file above or not
    articy_start_time = datetime.now()
    print("\nStarting JSON load at %s" % articy_start_time)
    # Ren'Py note: inside the game engine, build json_file_path from config.gamedir as shown above
    with open(json_file_path, 'r') as json_file:
        articy_data = json_module.load(json_file)
    print("Done JSON load with time of %0.2f seconds" % (datetime.now() - articy_start_time).total_seconds())


if __name__ == "__main__":
    main()
A couple notes on using this script:
1. In several places I have comments that start with "Ren'Py note:". Check these for changes to make if you're running it inside your game directory, as inside Ren'Py you can use the config.gamedir shortcut to get a full path to your files.

2. ujson is optional. If you don't have it on your system the script will use the standard json module, so don't sweat that one so much. You do need to download xmltodict.

Right now I have a 16MB Articy XML export, and here's how long it takes using ujson:
$ python test_xml_to_json.py
Starting xmltodict at 17:30:14.083000
Done xmltodict with load time of 17.23 seconds

Starting JSON dump at 17:30:31.310000
Done JSON dump with save time of 0.31 seconds

Starting JSON load at 2015-06-27 17:30:31.618000
Done JSON load with time of 0.31 seconds
And here are the results using the standard Python json module:
$ python test_xml_to_json.py
Starting xmltodict at 2015-06-27 17:34:07.335000
Done xmltodict with load time of 16.73 seconds

Starting JSON dump at 2015-06-27 17:34:24.064000
Done JSON dump with save time of 4.08 seconds

Starting JSON load at 2015-06-27 17:34:28.143000
Done JSON load with time of 0.66 seconds
Note that the standard module is not far behind at reading (0.66 vs. 0.31 seconds), but is much slower at writing (4.08 vs. 0.31 seconds). When you're developing you're really going to want that speed boost at the writing step, as otherwise Articy conversions feel like they take forever.
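
For anyone who wants to check the read/write gap on their own machine, here is a small timing sketch using only the standard json module. The time_call helper and the synthetic data are made up for illustration and are far smaller than a 16MB export, so the numbers will not match the ones above.

```python
import json
import time


def time_call(func, *args):
    """Run func(*args) and return (result, elapsed_seconds)."""
    start = time.time()
    result = func(*args)
    return result, time.time() - start


# Synthetic stand-in for converted Articy data; the shape is invented.
data = {"lines": [{"id": i, "text": "line %d" % i} for i in range(10000)]}

encoded, dump_seconds = time_call(json.dumps, data)
decoded, load_seconds = time_call(json.loads, encoded)
print("dump: %0.3f s, load: %0.3f s" % (dump_seconds, load_seconds))
```

If ujson is installed, swapping it in for json in the two time_call lines gives a like-for-like comparison, since it provides dumps and loads as well.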