Item Representation: DataNatuRe, Ecoinform

Both DataNatuRe and Ecoinform currently serve primarily German data consumers; perhaps that will change at some point.

For some time now, DataNatuRe has been building a product database for the organic and natural goods trade. In addition to data from our proprietary “Ökobox” pool and from Ecoinform, the pioneer in this field, our shops can now also display data from this new source.

In the online shop we need descriptions, pictures, exact manufacturer information and details of the ingredients. This is important for displaying, filtering and organizing information on the web pages. Some of this data also matters in the ERP backend. Anyone who has ever entered new articles into the system knows how time-consuming this process can be.

Of course, things are easiest for the seller when this data is complete and uniform, because then the import can run automatically. Unfortunately, this is not always the case: why is it so difficult to get good article information for the online shop? Or to put it another way: why is it so difficult to build a unified database?

On the one hand, not every manufacturer has the capacity to present its products well for all sales channels. For example, an image for print should usually be high-resolution and rich in detail, while an article list in the online shop is better served by a smaller, eye-catching image with a lower file size.

On the other hand, a central database is used by many people in different roles, and each has a very specific view of the data: bookkeepers, for example, tend to look at prices rather than pictures, while marketing people focus on the text and the picture rather than the ingredients. Statisticians compare merchandise categories and need the articles in a certain order. Some mean well and fill in the fields for description-long, description-medium and description-short; others are in a hurry and write a single sentence in the “-long” field.

The article information also differs according to the product group, i.e. a cheese has different data fields (e.g. fat content, rind or rind type) than a wine (drinking temperature or decanting time). The structure and type of the data vary as well: the fat content is expressed as a percentage, whereas the country of origin needs an unambiguous value. An article can have one allergen or ten; it usually has one certifier, but sometimes several.

This complexity is evident even in DataNatuRe’s rather young database, which is designed with modern means and on a broad basis. This gives us all the information our system needs, but preparing it is not easy:

  • From the large number of data fields (more than 1000 in total), the fields relevant for filtering or display in the online shop must be selected, and the merchandise category assignment in the individual shops usually does not correspond to the data supplier’s specifications.
  • Many numerical values are not entered consistently and cannot easily be compared, e.g. during filtering (100 ml is 0.1 l, but a computer does not know that without help).
  • Despite a lot of effort put into the input masks, I have already seen a vegan product that is explicitly marked as not vegetarian, or a chicken broth that explicitly does not contain pork (oh).
  • It often happens that some articles have a super detailed description while others have none. When such articles appear side by side in the shop, the contrast stands out negatively, not to mention article names that wrap across the screen.
  • Unfortunately, images are typically not available in web-optimized form for small and large screens, or for detail and list display. This is where automated image processing algorithms come into play, which have to cope with very different raw materials.
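To illustrate the point about inconsistent numerical values, here is a minimal Python sketch of how volume entries could be normalized before filtering. The function name `volume_in_litres`, the list of supported units and the accepted input format are assumptions made for this example; they do not describe how DataNatuRe or our import actually stores these values.

```python
import re
from typing import Optional

# Conversion factors to litres. The set of units is an illustrative
# assumption, not a list taken from any of the data pools.
_TO_LITRES = {"ml": 0.001, "cl": 0.01, "dl": 0.1, "l": 1.0}

# Accepts inputs such as "100ml", "0,75 l" or "0.5 L".
_PATTERN = re.compile(r"^\s*(\d+(?:[.,]\d+)?)\s*(ml|cl|dl|l)\s*$", re.IGNORECASE)


def volume_in_litres(raw: str) -> Optional[float]:
    """Return the volume in litres, or None if the entry cannot be parsed."""
    match = _PATTERN.match(raw)
    if match is None:
        return None
    # German data often uses a decimal comma, so accept both separators.
    number = float(match.group(1).replace(",", "."))
    return number * _TO_LITRES[match.group(2).lower()]
```

With values normalized like this, "100ml" and "0,1 l" end up as the same number and can be compared in a filter. Entries that do not parse return None and can be flagged for manual review instead of being guessed.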

Last but not least, there is a range of additional information that in turn has to be supplemented from other sources. For example, we have our own data pools with information on manufacturers, associations or origin marks.

To make working with all these details as easy as possible, the InfoPool module is available in the PCG. Not only can groups of articles be edited en masse, but you can also simplify your work with the help of simple dependency rules. After all, the information must remain correct, even if the manufacturer changes the formulation of a cream, for example.
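What a simple dependency rule might look like can be sketched in a few lines of Python. This is not the InfoPool module’s actual rule format; the rule representation, the field names `vegan` and `vegetarian`, and both function names are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical rule format: (premise, consequence) means
# "if premise is True, consequence must also be True".
Rule = Tuple[str, str]
Article = Dict[str, Optional[bool]]

RULES: List[Rule] = [("vegan", "vegetarian")]


def find_contradictions(article: Article, rules: List[Rule] = RULES) -> List[str]:
    """List every rule that the article explicitly violates."""
    problems = []
    for premise, consequence in rules:
        if article.get(premise) is True and article.get(consequence) is False:
            problems.append(f"{premise} implies {consequence}")
    return problems


def fill_implied(article: Article, rules: List[Rule] = RULES) -> Article:
    """Return a copy in which missing consequences are set to True."""
    result = dict(article)
    for premise, consequence in rules:
        if result.get(premise) is True and result.get(consequence) is None:
            result[consequence] = True
    return result
```

A check of this kind would catch the vegan article that is marked as not vegetarian, while `fill_implied` only fills gaps and never overwrites a value that somebody entered explicitly.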

The different pools can be activated in the shop settings. For fast results, the system can add missing information from these data sources, e.g. by matching articles via their GTIN. There are also usage statistics and query options, which can be used during data entry and for checking purposes.
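The idea of adding missing information via GTIN can be sketched as follows. This is a simplified illustration under stated assumptions, not the shop system’s implementation: each pool is modelled as a plain dictionary keyed by GTIN, the order of the pools is taken as their priority, and the function name `fill_missing` and all field names are made up for the example.

```python
from typing import Any, Dict, List

Record = Dict[str, Any]
Pool = Dict[str, Record]  # assumed shape: GTIN -> record from one data pool


def _is_empty(value: Any) -> bool:
    """Treat None and empty strings as missing."""
    return value is None or value == ""


def fill_missing(article: Record, pools: List[Pool]) -> Record:
    """Return a copy of the article with empty fields filled from the pools.

    Pools are consulted in the given order; existing values are never
    overwritten, so earlier pools take priority over later ones.
    """
    enriched = dict(article)
    gtin = article.get("gtin")
    for pool in pools:
        record = pool.get(gtin)
        if record is None:
            continue
        for field, value in record.items():
            if _is_empty(enriched.get(field)) and not _is_empty(value):
                enriched[field] = value
    return enriched


# Usage with made-up sample data:
pool_a = {"4000000000006": {"description": "Natural yogurt", "brand": ""}}
pool_b = {"4000000000006": {"description": "Yogurt", "brand": "Example Farm"}}
article = {"gtin": "4000000000006", "name": "Yogurt 500 g", "description": ""}
enriched = fill_missing(article, [pool_a, pool_b])
```

In this sample, the description comes from the first pool and the brand from the second, because the first pool has no brand value. The article’s own name stays untouched.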