LABIC - Bioinformatics and Computational Intelligence Laboratory

Local Repository of Research Datasets


UTFPR-ABCD: Amazon Book Covers Dataset


 

  1. Objective:

The dataset was created using the Amazon virtual bookstore. All relationships in it come from visiting Amazon's website and recording its product recommendations, which in this dataset are books. The dataset was created to enable data mining studies that relate the features of book covers to the books' popularity and sales.

  2. Data Description:

The dataset contains over 180 million relationships among a pool of almost 6 million objects. These relationships are the result of visiting Amazon and recording the product recommendations that it provides. For our approach, we kept only the products in the book category and carried out the following steps:

The total number of books obtained up to this point was 59,173. A sample of the dataset can be seen in the figure below, where:


In addition, color and object features were extracted from the book covers using colorgram (a Python library) and the YOLO neural network, respectively. These features were added to our database, as also shown in the figure below.
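As an illustration of the color step, the sketch below shows one way dominant colors could be pulled from a cover with colorgram and flattened into per-book feature columns. It is not the pipeline actually used for this dataset: the number of colors, the helper names (`extract_palette`, `palette_to_row`) and the column names (`color1_r`, `color1_share`, ...) are assumptions made for the example. The only library call used is `colorgram.extract(path, n)`, which returns color objects exposing `rgb` and `proportion`.

```python
from typing import Dict, List, Tuple

# (r, g, b, proportion) for one dominant color; proportion is the share of pixels.
ColorEntry = Tuple[int, int, int, float]


def extract_palette(image_path: str, n_colors: int = 5) -> List[ColorEntry]:
    """Extract the dominant colors of one cover image with colorgram.py."""
    # Imported lazily so the rest of the sketch runs without the package installed.
    import colorgram  # pip install colorgram.py

    colors = colorgram.extract(image_path, n_colors)
    return [(c.rgb.r, c.rgb.g, c.rgb.b, c.proportion) for c in colors]


def palette_to_row(palette: List[ColorEntry]) -> Dict[str, float]:
    """Flatten a palette into hypothetical feature columns for one book."""
    row: Dict[str, float] = {}
    for i, (r, g, b, proportion) in enumerate(palette, start=1):
        row[f"color{i}_r"] = r
        row[f"color{i}_g"] = g
        row[f"color{i}_b"] = b
        row[f"color{i}_share"] = round(proportion, 4)
    return row


# Usage with a hand-written palette, so no image file is needed:
example_row = palette_to_row([(200, 30, 30, 0.6), (250, 250, 250, 0.4)])
```

For a real cover, `palette_to_row(extract_palette("cover.jpg"))` would produce the same kind of row, where `cover.jpg` is a placeholder path.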

  3. Link to the dataset:

    (will be available for download soon)

  4. Related Papers: