To illustrate how the screen scraping methods are used and how they differ in practice, let’s first scrape a Notepad window that contains some text and compare the results. The following screenshot shows the window we used.
![image_156.png 701](/files/b465f6a-image_156.png)
The FullText method
![screen_scraping_fulltext.png 1127](/files/4cf4077-screen_scraping_fulltext.png)
As you can see, no formatting is retained. However, the text is still retrieved even if you hide the Notepad window while scraping. This is the fastest method.
The Native method
![native1.png 1127](/files/b191648-native1.png)
![native2.png 1127](/files/17ec4b3-native2.png)
As you can see in the first screenshot, you can extract the text with its position on the screen, as well as retrieve the exact position of each word (second screenshot).
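Word-level position data of the kind shown above can be pictured as a list of words, each with a bounding box. The following is a minimal Python sketch of that idea, not the tool's actual output format: the `WordBox` type, its field names, and the sample coordinates are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class WordBox:
    """Hypothetical record: one scraped word and its on-screen rectangle in pixels."""
    text: str
    left: int
    top: int
    width: int
    height: int

    def center(self) -> Tuple[int, int]:
        # Midpoint of the rectangle, e.g. a point that could be clicked.
        return (self.left + self.width // 2, self.top + self.height // 2)


def find_word(words: Iterable[WordBox], target: str) -> Optional[WordBox]:
    """Return the first box whose text equals target (case-insensitive), or None."""
    wanted = target.casefold()
    for box in words:
        if box.text.casefold() == wanted:
            return box
    return None


# Made-up sample data standing in for a scrape result.
words = [
    WordBox("Lorem", left=10, top=20, width=40, height=12),
    WordBox("ipsum", left=56, top=20, width=38, height=12),
]

match = find_word(words, "IPSUM")
if match is not None:
    print(match.text, match.center())
```

Keeping the rectangle with each word is what makes it possible to act on a specific word later, which plain text output cannot support.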
The Microsoft OCR method
![micro_ocr.png 1127](/files/0a0ca4b-micro_ocr.png)
As you can see, the accuracy of this output method is not 100%, but it still manages to keep the position of the text. The exact on-screen position, in pixels, can also be retrieved. However, as the screenshot shows, this is not the fastest of the output methods.
The Google OCR method
![google_ocr.png 1127](/files/7d7ab45-google_ocr.png)
As with Microsoft’s MODI, the Google OCR method is not 100% accurate and takes longer than the non-OCR methods. However, it retrieves the position of the text within the window.
Now add some white text on a black background, in Paint for example, and try to scrape it.
![image_162.png 494](/files/909ef64-image_162.png)
As you can see, only the OCR methods work in this scenario.
![screen_scraping_paint.png 1102](/files/dafe80c-screen_scraping_paint.png)
Now let’s try scraping an application and see the results. We use a dummy expense app, which you can download here.
![image_164.png 734](/files/a144be7-image_164.png)
If we scrape this entire window, we receive the following results:
- FullText with hidden text works really well, being able to read even the minimize and restore buttons.
![screen_scraping_expenseit.png 1103](/files/2baf1a0-screen_scraping_expenseit.png)
- Native does not work on this UI because the application does not use the Graphics Device Interface (GDI) to render text. For more information on GDI, please see the official Microsoft documentation.
- Microsoft OCR works pretty well, although accuracy is still not 100%.
![screen_scraping_micro1.png 1103](/files/58c8f17-screen_scraping_micro1.png)
- Google OCR does not handle this UI very well, as the scraped area is quite large.
![screen_scraping_google2.png 1103](/files/cad999f-screen_scraping_google2.png)
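The observations from the three tests above can be condensed into a small lookup for choosing a method. The Python sketch below is only a summary aid under stated assumptions: the `Method` type, its field names, and the `candidates` helper are hypothetical, and the flag values are our reading of the results on this page (for example, "no formatting is retained" is treated as "no position data" for FullText). The list order reflects only that FullText was reported as fastest and the OCR methods as slower.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Method:
    """Hypothetical summary of one scraping method, based on the tests above."""
    name: str
    keeps_position: bool   # output preserved where the text sits on screen
    works_on_images: bool  # could read the text drawn in Paint
    requires_gdi: bool     # only works when the app renders text through GDI


# Ordered roughly fastest first, as far as the page states it.
METHODS = [
    Method("FullText", keeps_position=False, works_on_images=False, requires_gdi=False),
    Method("Native", keeps_position=True, works_on_images=False, requires_gdi=True),
    Method("Microsoft OCR", keeps_position=True, works_on_images=True, requires_gdi=False),
    Method("Google OCR", keeps_position=True, works_on_images=True, requires_gdi=False),
]


def candidates(need_position: bool, target_is_image: bool, app_uses_gdi: bool) -> List[str]:
    """Return the names of methods that fit the scenario, in the order of METHODS."""
    result = []
    for m in METHODS:
        if need_position and not m.keeps_position:
            continue
        if target_is_image and not m.works_on_images:
            continue
        if m.requires_gdi and not app_uses_gdi:
            continue
        result.append(m.name)
    return result


# The Paint scenario: text exists only as pixels.
print(candidates(need_position=False, target_is_image=True, app_uses_gdi=False))
# The expense app scenario: real UI text, but not rendered through GDI.
print(candidates(need_position=False, target_is_image=False, app_uses_gdi=False))
```

The lookup says nothing about accuracy. As noted above, neither OCR method is 100% accurate, and Google OCR struggled with the large expense app window, so a method being listed as a candidate only means it returned text in the corresponding test.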