Digitalize your receipts

We are preserving our planet by helping you with your finances

Have you ever tried to do your accounts with your receipts?



Yes... it's often very boring. Doing your accounts from paper receipts is very time-consuming. As purchases pile up, we don't have time to sort the receipts, and in the end they often get lost in a wallet. We have found a solution to this problem using machine learning.
"Machine Learning is the practice of using algorithms to parse data, learn from it, and then make a determination or prediction about something in the world. So rather than hand-coding software routines, the machine is “trained” using large amounts of data and algorithms that give it the ability to learn how to perform the task."

MICHAEL COPELAND


Source: Le Figaro


Text recognition

This project is based on supervised learning: we have to teach our AI the key characteristics of paper receipts. Once the model is trained, it will be able to identify (with a certain degree of accuracy) typical paper receipts from different retailers. We'll also use ml5.js, a JavaScript library built on TensorFlow.js (an open-source ML library).
To make all of this work, we will use text classification, which is "the task of assigning a set of predefined categories to open-ended text. Text classifiers can be used to organize, structure, and categorize pretty much any kind of text. For example, chat conversations can be organized by language; brand mentions can be organized by sentiment; and so on." Source: MonkeyLearn
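To make the idea of supervised text classification concrete, here is a minimal sketch in plain JavaScript. It is not the ml5.js API and not the project's actual model: it uses a small naive Bayes classifier as a stand-in technique, and the training receipts, the labels ("grocery", "fuel", "restaurant") and the function names `tokenize`, `train` and `classify` are all assumptions made up for illustration.

```javascript
// Minimal multinomial naive Bayes text classifier (plain JavaScript, no ml5.js).
// The training receipts and labels below are invented for illustration.

function tokenize(text) {
  // Lowercase and keep runs of letters only; digits and punctuation are dropped.
  return text.toLowerCase().match(/[a-z]+/g) || [];
}

function train(examples) {
  const model = { docCount: {}, wordCount: {}, totalWords: {}, vocab: new Set(), total: 0 };
  for (const { text, label } of examples) {
    model.docCount[label] = (model.docCount[label] || 0) + 1;
    model.wordCount[label] = model.wordCount[label] || {};
    model.totalWords[label] = model.totalWords[label] || 0;
    model.total += 1;
    for (const word of tokenize(text)) {
      model.wordCount[label][word] = (model.wordCount[label][word] || 0) + 1;
      model.totalWords[label] += 1;
      model.vocab.add(word);
    }
  }
  return model;
}

function classify(model, text) {
  const words = tokenize(text);
  let best = null;
  let bestScore = -Infinity;
  for (const label of Object.keys(model.docCount)) {
    // log P(label) + sum of log P(word | label), with add-one (Laplace) smoothing.
    let score = Math.log(model.docCount[label] / model.total);
    for (const word of words) {
      const count = model.wordCount[label][word] || 0;
      score += Math.log((count + 1) / (model.totalWords[label] + model.vocab.size));
    }
    if (score > bestScore) {
      bestScore = score;
      best = label;
    }
  }
  return best;
}

// Labeled examples: this is the "supervised" part of supervised learning.
const trainingReceipts = [
  { text: "milk bread eggs butter total", label: "grocery" },
  { text: "apples cheese pasta tomato total", label: "grocery" },
  { text: "unleaded fuel litres pump total", label: "fuel" },
  { text: "diesel fuel pump car wash total", label: "fuel" },
  { text: "pizza dessert coffee table service", label: "restaurant" },
  { text: "burger fries drink table tip", label: "restaurant" },
];

const receiptModel = train(trainingReceipts);
console.log(classify(receiptModel, "Diesel 40 litres pump 3"));
```

With so few examples, the model only recognises words it has already seen; a real system would need many more labeled receipts per retailer.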









Source: Nanonets

From these examples we can draw out some attributes of OCR tasks:



  • Text density: on a printed/written page, text is dense. However, given an image of a street with a single street sign, text is sparse.
  • Structure of text: text on a page is structured, mostly in strict rows, while text in the wild may be sprinkled everywhere, in different rotations.
  • Fonts: printed fonts are easier, since they are more structured than noisy hand-written characters.
  • Character type: text may come in different languages, which may be very different from each other. Additionally, the structure of text may be different from that of numbers, such as house numbers.
  • Artifacts: clearly, outdoor pictures are much noisier than the comfortable scanner.
  • Location: some tasks include cropped/centred text, while in others, text may be located in random locations in the image.



Source: Nanonets

    Our system will analyse the keywords transcribed from the receipt, as is already done with OCR. Optical character recognition tools are undergoing a quiet revolution as ambitious software providers combine OCR with AI. Nanonets has done this with invoices. Let's take a look at the code using JSON.
    As a consequence, data capture software now captures information and comprehends its content at the same time. Thanks to our algorithm, the AI will be able to understand what type of receipt it is looking at, and OCR can surface the key elements we want to see on our screens. Thanks to artificial intelligence, let's optimise our time to be better every day.
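The step from OCR output to structured receipt data can be sketched roughly as follows. This is only an illustration in plain JavaScript: the JSON shape (a `lines` array), the `KEYWORDS` lists and the `parseReceipt` function are hypothetical and do not reproduce the actual Nanonets response format or API.

```javascript
// Hypothetical OCR output: an invented JSON shape, NOT the actual Nanonets response format.
const ocrResponse = `{
  "lines": [
    "SUPERMARKET EXAMPLE",
    "2019-11-04 18:32",
    "Milk 1L        1.15",
    "Bread          0.95",
    "TOTAL          2.10"
  ]
}`;

// Hypothetical keyword lists per receipt type, chosen for illustration only.
const KEYWORDS = {
  grocery: ["supermarket", "milk", "bread"],
  fuel: ["diesel", "unleaded", "pump"],
  restaurant: ["table", "menu", "tip"],
};

function parseReceipt(jsonText) {
  const { lines } = JSON.parse(jsonText);
  const text = lines.join("\n").toLowerCase();

  // Total: the amount on the first line that starts with "total".
  const totalMatch = text.match(/^total\s+(\d+\.\d{2})/m);
  // Date: the first ISO-style date (YYYY-MM-DD) found anywhere in the text.
  const dateMatch = text.match(/\d{4}-\d{2}-\d{2}/);

  // Receipt type: the category with the most of its keywords present in the text.
  let type = "unknown";
  let bestHits = 0;
  for (const [category, words] of Object.entries(KEYWORDS)) {
    const hits = words.filter((word) => text.includes(word)).length;
    if (hits > bestHits) {
      bestHits = hits;
      type = category;
    }
  }

  return {
    type,
    date: dateMatch ? dateMatch[0] : null,
    total: totalMatch ? Number(totalMatch[1]) : null,
  };
}

console.log(parseReceipt(ocrResponse));
```

Fixed keyword lists are the simplest possible choice here; a trained text classifier could replace the keyword count once enough labeled receipts are available.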



    Code example using JSON by Nanonets






    A Beginner's Guide to Machine Learning with ml5.js








    Times have changed. Make your life better. Download our system.


    By Ghislain Willaume

    For Designing with web