


Rhine API Graphs Semantic Distance Between Content

The Rhine API, developed by Speare and first revealed at Cornell University’s BigRed//Hacks hackathon, has taken content analysis and comparison to a new level.

When developing applications, there are only a limited number of tools for comparing pieces of text and determining how relevant they are to each other. An array of characters carries none of the meaning of the word it spells, so plain strings offer no way to parse, process, or understand language. This is where Rhine steps in.

Rhine provides an interface that takes two English words or phrases and computes their “semantic distance.” In other words, Rhine can determine whether they are related to each other in much the same way that a natural language speaker would. Beyond this, Rhine can also determine whether two phrases are synonymous and, given a single phrase, return a list of similar phrases.

How does Rhine do this? Although Rhine is closed source, one can imagine that it uses a graph of nodes and edges to connect phrases. The number of edges between two nodes determines their relevancy: fewer edges mean a shorter semantic distance and thus a closer relationship. To distinguish synonyms from merely “similar” phrases, a bucket-style approach is possible (perhaps a graph within a graph?). Each node would hold a list of synonyms, and each edge leading away from it would reach a new node and list that is one unit of relevance further away.
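The guess above can be sketched in a few lines of Python. This is only an illustration of the speculated design, not Rhine's actual implementation: the toy graph, the bucket contents, and the names `BUCKETS`, `EDGES`, `find_bucket`, and `semantic_distance` are all made up for the example. Each node is a bucket of synonyms, and a breadth-first search counts the edges between two buckets.

```python
from collections import deque

# Hypothetical toy data: each node is a "bucket" of synonyms.
BUCKETS = {
    "ferrari": {"ferrari"},
    "car": {"car", "automobile"},
    "vehicle": {"vehicle"},
    "plane": {"plane", "airplane"},
    "chair": {"chair", "seat"},
}

# Hypothetical edges: linked buckets are one unit of relevance apart.
EDGES = {
    "ferrari": {"car"},
    "car": {"ferrari", "vehicle"},
    "vehicle": {"car", "plane"},
    "plane": {"vehicle"},
    "chair": set(),
}


def find_bucket(phrase):
    """Return the name of the bucket containing the phrase, or None."""
    for name, members in BUCKETS.items():
        if phrase in members:
            return name
    return None


def semantic_distance(a, b):
    """Count the edges on the shortest path between two phrases.

    Returns 0 for synonyms (same bucket) and None when either phrase
    is unknown or no path connects them ("not related").
    """
    start, goal = find_bucket(a), find_bucket(b)
    if start is None or goal is None:
        return None
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in EDGES[node]:
            if neighbor == goal:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None
```

In this toy graph, “car” and “automobile” share a bucket and so count as synonyms, while “ferrari” and “chair” have no path between them and come back as unrelated.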

Some examples from the Rhine Semantic Distance Tool (a smaller score implies a closer connection):

  • “ferrari” and “car”: 50.70 Rhine Units
  • “ferrari” and “plane”: 223.00 Rhine Units
  • “ferrari” and “chair”: Not Related
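To show how an application might use scores like these, here is a minimal sketch that ranks candidate words by closeness to “ferrari.” The numbers are copied from the examples above. The function name `rank_by_relevance` and the use of `None` to stand in for “Not Related” are assumptions made for the illustration, not part of Rhine's interface.

```python
# Scores from the examples above; None is an assumed stand-in for "Not Related".
scores = {"car": 50.70, "plane": 223.00, "chair": None}


def rank_by_relevance(scores):
    """Return the related words, closest (smallest score) first."""
    related = {word: s for word, s in scores.items() if s is not None}
    return sorted(related, key=related.get)


ranked = rank_by_relevance(scores)
```

Here “car” ranks ahead of “plane”, and “chair” is dropped because it has no score at all.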

The Rhine API has already been used in some fairly cool applications. Feud uses Rhine to create an online “Family Feud”-style game. Tip of the Tongue is a word association game built on Rhine that focuses on improving cognitive acuity and reducing the risk of aphasia-associated anomia.

Of course, there are still several kinks to be worked out, which could be attributed to a relatively small dataset. Some results are inconsistent: “au element” and “gold” return as synonyms, but “h element” and “hydrogen” do not. Hopefully, the developers are working to expand their graph through machine learning as users provide feedback on various matches.

To learn more about the Rhine API, visit www.Rhine.io

 

 
