Cheminfo-to-web Project

Overview

In this project, we experimented with porting a series of C/C++ chemoinformatics libraries to JavaScript via Emscripten. This may be the fastest way to bring existing native chemoinformatics software to the web, since it requires little or no modification of the source code beyond recompiling. The compiled result is highly optimized JavaScript code that runs across different web browsers at fairly good speed.

Currently, four C/C++ chemoinformatics libraries/packages have been ported, as described below. You can download them from the download page or our GitHub page. Note that it is possible (and not difficult) to build your own JavaScript compilation of these libraries to suit your own application, with different function exports, class bindings and so on. Build instructions can be found on the GitHub page of this project.

InChI

The official open source library supporting IUPAC InChI, an open-source chemical structure representation algorithm. It is written in plain C and provides a set of API functions to read and write InChI. Most of these APIs involve complex structures or pointers and are difficult to represent directly in JavaScript, so a wrapper function, molToInchiJson, is provided. It receives a molecule as an MDL MOL format string and outputs the InChI information (including the InChI string and AuxInfo) as stringified JSON. More details can be found in our InChI demo.
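As a rough sketch of how the wrapper's output might be consumed from web code, the TypeScript below parses and validates the stringified JSON. Only the function name molToInchiJson comes from this project. The call signature, the JSON field names `inchi` and `auxInfo`, and the stub standing in for the compiled module are assumptions made for illustration, so the InChI demo remains the reference for the actual shape.

```typescript
// Hypothetical shape of the parsed result; the real field names may differ.
interface InchiResult {
  inchi: string;
  auxInfo: string;
}

// Assumed signature of the exported wrapper: MOL text in, JSON text out.
type MolToInchiJson = (molData: string) => string;

// Parse the wrapper's JSON output and check that both fields are strings.
function parseInchiJson(json: string): InchiResult {
  const raw: unknown = JSON.parse(json);
  if (typeof raw !== "object" || raw === null) {
    throw new Error("Expected a JSON object from molToInchiJson");
  }
  const rec = raw as Record<string, unknown>;
  const inchi = rec["inchi"];
  const auxInfo = rec["auxInfo"];
  if (typeof inchi !== "string" || typeof auxInfo !== "string") {
    throw new Error("Missing InChI or AuxInfo field in result");
  }
  return { inchi, auxInfo };
}

// Call the wrapper and return a typed result.
function molToInchi(wrapper: MolToInchiJson, molData: string): InchiResult {
  return parseInchiJson(wrapper(molData));
}

// Stub standing in for the compiled module, for demonstration only.
// It ignores its input and returns a fixed methane InChI with a placeholder AuxInfo.
const stubWrapper: MolToInchiJson = () =>
  JSON.stringify({ inchi: "InChI=1S/CH4/h1H4", auxInfo: "AuxInfo=(placeholder)" });

const methane = molToInchi(stubWrapper, "(MOL block text goes here)");
console.log(methane.inchi);
```

Validating the parsed object before use keeps a malformed or error response from the compiled code from propagating silently into the rest of the page.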

Indigo

Indigo is an open source universal organic chemistry toolkit developed by EPAM. It supports I/O of common molecule formats (including Molfiles/Rxnfiles, SDF, RDF, CML and SMILES/SMARTS), automatic layout of SMILES-represented molecules and reactions, and many molecule-based algorithms (including substructure matching, tautomer matching, molecule fingerprinting, molecule similarity computation and SSSR ring enumeration).

The toolkit is written in C++. However, it also ships with a plain C API, which makes it easy for the Emscripten compiler to export functions to JavaScript. In our current compilation, most of the functions in the C API have been ported, and the JS library is now almost as powerful as the native one. Some of these functions are demonstrated in the Indigo demo.
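To illustrate why a plain C API is convenient here, the sketch below binds three C functions through Emscripten's `cwrap` helper and uses them with Indigo's integer-handle convention. The names indigoLoadMoleculeFromString, indigoCanonicalSmiles and indigoFree belong to the Indigo C API, but whether this particular build exports them under those names and makes them reachable through `cwrap` is an assumption. The stub module is purely illustrative and returns canned values.

```typescript
// Minimal slice of an Emscripten module object: cwrap binds an exported C function.
interface CModule {
  cwrap(
    name: string,
    returnType: string | null,
    argTypes: string[]
  ): (...args: any[]) => any;
}

// Load a molecule, ask for its canonical SMILES, and always free the handle.
function canonicalSmiles(mod: CModule, input: string): string {
  const load = mod.cwrap("indigoLoadMoleculeFromString", "number", ["string"]);
  const toSmiles = mod.cwrap("indigoCanonicalSmiles", "string", ["number"]);
  const free = mod.cwrap("indigoFree", "number", ["number"]);

  const handle: number = load(input);
  if (handle < 0) {
    // Indigo signals failure with a negative handle.
    throw new Error("Indigo could not load the molecule");
  }
  try {
    return toSmiles(handle);
  } finally {
    free(handle);
  }
}

// Stub standing in for the compiled module (assumption, for demonstration only).
const freedHandles: number[] = [];
const stubModule: CModule = {
  cwrap(name) {
    switch (name) {
      case "indigoLoadMoleculeFromString":
        return (s: string) => (s.length > 0 ? 1 : -1);
      case "indigoCanonicalSmiles":
        return () => "CCO"; // canned answer, not computed
      case "indigoFree":
        return (h: number) => {
          freedHandles.push(h);
          return 1;
        };
      default:
        throw new Error(`Function not exported: ${name}`);
    }
  },
};

console.log(canonicalSmiles(stubModule, "OCC"));
```

Because the C API passes only numbers and strings across the boundary, no class binding layer is needed. The cost is that handles must be released by hand, which the `try`/`finally` takes care of.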

OpenBabel

OpenBabel is a well-known open source chemoinformatics toolkit, famous for its ability to convert between dozens of chemical formats. Compiling OpenBabel to JavaScript and integrating it with existing web-based chemoinformatics libraries could therefore be extremely helpful.

OpenBabel itself is written in C++ and consists largely of C++ classes and objects. Emscripten provides two options for exporting C++ classes to JavaScript: Embind and the WebIDL Binder. We use the former here. Some helper classes were also created to wrap certain functions and expose them to JS code. The OpenBabel demo demonstrates usage of the JS compilation, especially for chemistry data I/O.
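One practical consequence of Embind is that JavaScript objects created from bound C++ classes are not garbage collected on the C++ side and need an explicit `delete()` call. The helper below is a small sketch of a pattern for guaranteeing that release. The `FakeConverter` class is a hypothetical stand-in written in plain TypeScript, not one of the classes actually exposed by this compilation.

```typescript
// Every Embind-created object exposes delete() to release its C++ instance.
interface EmbindObject {
  delete(): void;
}

// Run a callback with the object, then release it even if the callback throws.
function withEmbind<T extends EmbindObject, R>(obj: T, use: (obj: T) => R): R {
  try {
    return use(obj);
  } finally {
    obj.delete();
  }
}

// Hypothetical stand-in for a bound helper class, for demonstration only.
class FakeConverter implements EmbindObject {
  deleted = false;

  // Pretend conversion: just strips surrounding whitespace.
  convert(input: string): string {
    return input.trim();
  }

  delete(): void {
    this.deleted = true;
  }
}

const converter = new FakeConverter();
const converted = withEmbind(converter, (c) => c.convert("  CCO\n"));
console.log(converted, converter.deleted);
```

Wrapping each bound object this way avoids leaking memory in the compiled module's heap when many molecules are processed in a long-lived page.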

OpenMD

OpenMD is an open source C++ molecular dynamics engine capable of efficiently simulating liquids, proteins, nanoparticles, interfaces, and other complex systems using atom types with orientational degrees of freedom (e.g. “sticky” atoms, point dipoles, and coarse-grained assemblies). We also ported this engine to test the efficiency of compiled JavaScript code on heavy calculation tasks. Remarkably, the JS compilation is only 2 times slower than the native one in some web browsers.

OpenMD does not currently provide a clear API, so some wrapper classes were created to help perform general calculation tasks. The original OpenMD program relies heavily on files, so a virtual file system is also used in the JS compilation. The OpenMD demo shows how to use the JS compilation to run calculation jobs in a web browser.
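The file-based workflow can be sketched as three steps: write the input files into the virtual file system, run the job, then read the output back. The interface below mirrors the `writeFile` and `readFile` calls of Emscripten's `FS` object. The in-memory implementation, the file names and the fake job are assumptions for illustration only and do not reflect the wrapper classes used in this compilation.

```typescript
// Subset of the Emscripten FS API used in this sketch.
interface VirtualFS {
  writeFile(path: string, data: string): void;
  readFile(path: string, opts: { encoding: "utf8" }): string;
}

// Write inputs, run the job, then return the contents of one output file.
function runJob(
  fs: VirtualFS,
  inputs: Record<string, string>,
  run: () => void,
  outputPath: string
): string {
  for (const path of Object.keys(inputs)) {
    fs.writeFile(path, inputs[path] as string);
  }
  run();
  return fs.readFile(outputPath, { encoding: "utf8" });
}

// In-memory stand-in for the module's file system, for demonstration only.
function makeMemoryFS(): VirtualFS {
  const files: Record<string, string> = {};
  return {
    writeFile(path, data) {
      files[path] = data;
    },
    readFile(path) {
      const text = files[path];
      if (text === undefined) {
        throw new Error(`No such file: ${path}`);
      }
      return text;
    },
  };
}

const memfs = makeMemoryFS();
const jobOutput = runJob(
  memfs,
  { "/job.omd": "line one\nline two\n" }, // placeholder input, not a real OpenMD file
  () => {
    // Fake job: reads the input and writes a one-line summary as its output.
    const src = memfs.readFile("/job.omd", { encoding: "utf8" });
    memfs.writeFile("/job.stat", `chars=${src.length}`);
  },
  "/job.stat"
);
console.log(jobOutput);
```

With the real Emscripten file system, parent directories have to exist before a file is written (for example via `FS.mkdir`), which is why the sketch keeps its files at the root.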