Tuesday, May 17, 2016

Google's SyntaxNet installation steps on Mac OS X El Capitan!

TensorFlow 
TensorFlow™ is an open source software library for numerical computation using data flow graphs. Nodes in the graph represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) communicated between them. 

The flexible architecture allows you to deploy computation to one or more CPUs or GPUs in a desktop, server, or mobile device with a single API. 

TensorFlow was originally developed by researchers and engineers working on the Google Brain Team within Google's Machine Intelligence research organization for the purposes of conducting machine learning and deep neural networks research, but the system is general enough to be applicable in a wide variety of other domains as well.

SyntaxNet 
SyntaxNet is an open-source neural network framework implemented in TensorFlow that provides a foundation for Natural Language Understanding (NLU) systems. Google's release includes all the code needed to train new SyntaxNet models on your own data, as well as Parsey McParseface, an English parser that Google has trained for you and that you can use to analyze English text.

Parsey McParseface
Parsey McParseface is built on powerful machine learning algorithms that learn to analyze the linguistic structure of language, and that can explain the functional role of each word in a given sentence.

INSTALLATION 
1.
Install Bazel, version 0.2.2 only.
Download the Bazel 0.2.2 installer for your operating system.

Run the installer:

$ chmod +x bazel-version-installer-os.sh
$ ./bazel-version-installer-os.sh --user

The --user flag installs Bazel to the $HOME/bin directory on your system and sets the .bazelrc path to $HOME/.bazelrc.

Set up your environment
If you ran the Bazel installer with the --user flag as above, the Bazel executable is installed in your $HOME/bin directory. It's a good idea to add this directory to your default paths, as follows:
$ export PATH="$PATH:$HOME/bin"
You can also add this command to your ~/.bashrc file.
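
To make the PATH change stick across terminal sessions, the export line can be appended to a shell startup file. The snippet below is a minimal sketch and not part of the Bazel instructions: add_path_line is a hypothetical helper name, and it appends the line only if it is not already present, so it should be safe to run more than once. Note that Terminal on OS X opens login shells by default, and bash login shells read ~/.bash_profile rather than ~/.bashrc, so you may want to pass that file instead.

```shell
# Hypothetical helper: append the Bazel PATH line to a startup file, but only once.
add_path_line() {
  rc_file="$1"
  # Single quotes keep $PATH and $HOME unexpanded, so they are evaluated at shell startup.
  line='export PATH="$PATH:$HOME/bin"'
  touch "$rc_file"
  # -x matches the whole line, -F treats the pattern as a fixed string.
  grep -qxF -- "$line" "$rc_file" || printf '%s\n' "$line" >> "$rc_file"
}

# Usage (pick the file your shell actually reads):
#   add_path_line "$HOME/.bashrc"
```

Open a new terminal window afterwards (or source the file) so the updated PATH takes effect.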

2.
brew install swig

3.
brew install python

4.
Install protocol buffers at a version supported by TensorFlow.

Check your protobuf version with:
pip freeze | grep protobuf

Upgrade to a supported version with:
pip install -U protobuf==3.0.0b2
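
The check and the upgrade can be combined so that pip only runs when the installed version differs from the pinned one. This is a rough sketch under a few assumptions: protobuf_version and needs_protobuf_upgrade are hypothetical helper names, and the parsing assumes pip freeze prints the package as a line of the form protobuf==VERSION.

```shell
WANTED="3.0.0b2"   # the version pinned in this post

# Print the protobuf version found in `pip freeze`-style text on stdin (empty if absent).
protobuf_version() {
  sed -n 's/^protobuf==//p'
}

# Exit 0 when the version read from stdin differs from $WANTED, i.e. an upgrade is needed.
needs_protobuf_upgrade() {
  [ "$(protobuf_version)" != "$WANTED" ]
}

# Usage:
#   pip freeze | needs_protobuf_upgrade && pip install -U "protobuf==$WANTED"
```

If protobuf is not installed at all, the helper treats that as needing the upgrade, so the same usage line also covers a fresh install.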

5.
pip install asciitree

6.
pip install numpy
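
Before moving on to the build, it may help to confirm that the tools from the earlier steps are actually reachable from your shell. The following is a small sketch, not something the SyntaxNet instructions require; check_tools is a hypothetical helper name, and the tool list in the usage line is my assumption about what the build needs on PATH.

```shell
# Hypothetical helper: print each command that is missing from PATH.
# Returns nonzero if at least one is missing.
check_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

# Usage:
#   check_tools bazel swig python pip git
```

If bazel shows up as missing, the most likely cause is that $HOME/bin is not on your PATH yet (see step 1).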

7.
Once you have completed the above steps, you can build and test SyntaxNet with the following commands:

git clone --recursive https://github.com/tensorflow/models.git
cd models/syntaxnet/tensorflow
./configure
cd ..
bazel test --linkopt=-headerpad_max_install_names \
  syntaxnet/... util/utf8/...

Bazel should finish by reporting that all tests passed.

8.
Parsing from Standard Input
Simply pass one sentence per line of text into the script at syntaxnet/demo.sh. The script will break the text into words, run the POS tagger, run the parser, and then generate an ASCII version of the parse tree:
echo 'Bob brought the pizza to Alice.' | syntaxnet/demo.sh

Input: Bob brought the pizza to Alice .
Parse:
brought VBD ROOT
 +-- Bob NNP nsubj
 +-- pizza NN dobj
 |   +-- the DT det
 +-- to IN prep
 |   +-- Alice NNP pobj
 +-- . . punct


The ASCII tree shows the text organized as in the parse, not left-to-right as visualized in the SyntaxNet tutorial graphs. In this example, we see that the verb "brought" is the root of the sentence, with the subject "Bob", the object "pizza", and the prepositional phrase "to Alice".
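
Since the script reads one sentence per line, several sentences can be parsed in a single run. Here is a minimal sketch: parse_sentences is a hypothetical wrapper name, the DEMO variable is my own addition for pointing at the script, and the default path assumes you are in the models/syntaxnet directory as in step 7.

```shell
# Path to the demo script; override DEMO if you run this from another directory.
DEMO="${DEMO:-syntaxnet/demo.sh}"

# Hypothetical wrapper: send each argument to the parser as its own input line.
parse_sentences() {
  printf '%s\n' "$@" | "$DEMO"
}

# Usage:
#   parse_sentences 'Bob brought the pizza to Alice.' 'Alice thanked Bob.'
```

The same effect can be had from a text file with one sentence per line by redirecting it into the script's standard input.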

Note: I have installed SyntaxNet on my Mac successfully by following all the steps above.

References:
1. https://github.com/tensorflow/models/tree/master/syntaxnet
2. http://googleresearch.blogspot.in/2016/05/announcing-syntaxnet-worlds-most.html
3. http://stackoverflow.com/
4. https://www.tensorflow.org/
5. http://bazel.io/docs/install.html