Elog.io

Open Source

  • Duration: 3 months
  • Team size: 1 person

Description

Elog.io is a good example of free and open source software. The product itself is a tool set for web research. It integrates with the browser and, when images are detected on a page, looks them up in the Elog.io database to find attribution information.

Basically, it lets users easily answer the question: is this photograph I'm looking at openly licensed? If not, who holds the copyright? They can also easily copy a photograph and its attribution into programs like WordPress, LibreOffice and Word. The solution is based on an advanced perceptual hash algorithm that finds visually similar images even if they have been recolored, resized or cropped. Currently, the database includes over 22 million images from Wikimedia Commons. The image matching algorithm handles basic changes such as resizing and color changes.
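To illustrate the general idea behind perceptual hashing, here is a minimal sketch of an average hash. It is not the algorithm Elog.io actually uses, which this description does not specify. The function names and the match threshold are hypothetical, and the input is assumed to be a grayscale image already downscaled to a small fixed grid (for example 8x8).

```javascript
// Illustrative average hash (aHash); a stand-in, not Elog.io's own algorithm.
// `pixels` is a flat array of grayscale values (0-255) for an image that has
// already been downscaled to a small fixed grid, e.g. 8x8 = 64 values.
function averageHash(pixels) {
  const mean = pixels.reduce((sum, p) => sum + p, 0) / pixels.length;
  // One bit per pixel: 1 if brighter than the mean, 0 otherwise.
  return pixels.map((p) => (p > mean ? 1 : 0));
}

// Number of differing bits between two hashes of equal length.
function hammingDistance(a, b) {
  if (a.length !== b.length) throw new Error("hash lengths differ");
  let distance = 0;
  for (let i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) distance++;
  }
  return distance;
}

// Hypothetical threshold: hashes within `maxDistance` bits count as a match.
function isSimilar(a, b, maxDistance = 5) {
  return hammingDistance(a, b) <= maxDistance;
}
```

Because each bit only records whether a pixel is brighter than the image's own mean, a uniform brightness shift leaves the hash unchanged, and downscaling to a fixed grid first makes the hash independent of the original image size. Cropping is harder and generally needs a more robust scheme than this sketch.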

Challenge

We had to work in an open-source style, with relaxed deadlines but clear goals to meet. The client supplied the back-end team. We built a browser extension that injects a script into a page, scans for all images on it, and talks to an API to identify, annotate and mark each image with license and author details. It tells the user whether an image is a known work by a known author, or whether it has simply been found on the web. Images can appear in the DOM tree in multiple ways, so it was not always easy to detect them all. Moreover, the interface and logic had to behave consistently across a vast number of websites. Because the application is an extension, special care had to be taken to avoid JavaScript conflicts; for example, jQuery could be loaded by both the target page and our extension.
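As a rough illustration of why image detection is not trivial, the sketch below collects image URLs from two of the places they can appear: img elements and CSS background images. It is an assumption about how such a scan might look, not the extension's actual code; the function names and the getStyle parameter are hypothetical.

```javascript
// Pull every url(...) out of a CSS background-image value.
function extractCssUrls(cssValue) {
  const urls = [];
  // Matches url(x), url("x") and url('x'); the closing quote must match the opening one.
  const pattern = /url\(\s*(['"]?)(.*?)\1\s*\)/g;
  let match;
  while ((match = pattern.exec(cssValue || "")) !== null) {
    if (match[2]) urls.push(match[2]);
  }
  return urls;
}

// Collect candidate image URLs under `root` (e.g. `document`).
// `getStyle` defaults to window.getComputedStyle when run in a browser.
function collectImageUrls(root, getStyle = (el) => window.getComputedStyle(el)) {
  const found = new Set(); // de-duplicates URLs used more than once
  for (const el of root.querySelectorAll("*")) {
    if (el.tagName === "IMG") {
      // currentSrc reflects the candidate chosen from srcset, when present.
      const src = el.currentSrc || el.src;
      if (src) found.add(src);
    }
    for (const url of extractCssUrls(getStyle(el).backgroundImage)) {
      found.add(url);
    }
  }
  return [...found];
}
```

In a page this could be called as collectImageUrls(document). A real scan would likely also need to cover cases such as lazily loaded images and content added after page load. On the conflict side, if the injected script runs in the page's own context and bundles jQuery, jQuery.noConflict(true) is the documented way to hand the $ and jQuery globals back to the page.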

Solution

Logicify assigned a single developer, with access to technical advisers, to this project. The developer had to analyze a large sample of website designs and test the resulting extension extensively to make sure the injected script would not interfere with the target websites open in a tab. The extension was written specifically for Google Chrome. The algorithm for calculating matches was developed on the server side. Our goal was to work reliably across the heterogeneous world of web pages and locate images as completely as possible. The images found in the DOM were hashed, so the algorithm had to be carefully tuned to fit the memory and processor limitations of typical commodity hardware. The product was completed and released in the Chrome Web Store.

Technology

  • Back-end: Node.js
  • Front-end: JavaScript