amz_audible_categories

4262 rows


Description

The amz_audible_categories table, referred to below as the Amazondata table, represents items or data points in Amazon’s product range; judging by the table name, these are Audible categories. Each record has a unique id. The other fields described here are category_url_id, name, link, parent, updated_at, created_at and country_id.

Category URL ID: This field references the location within Amazon’s hierarchy that corresponds to the current record. It serves as a child reference, much like a child node in an XML file. If a related record has the value id == 1, this record is, recursively, the parent of others.

Name: This field stores the actual product or data point name.

Link: This field provides the direct URL of where the item can be purchased via Amazon’s platform.

Parent: A reference to the Amazondata entry that is the ancestor node of this record, used to establish the relationship between children and parents. The parent record ID must have a value != 1 for each descendant to have a valid parent entry. The column listing for this table shows no foreign-key constraint on parent, so the relationship does not appear to be enforced by the schema.

Updated_at and Created_at: These fields store the date and time at which the item was last updated and created, respectively. This information helps track changes over time, such as price adjustments or product additions and deletions.

Country ID: This field is used by Amazon to categorize products according to their country of origin. Per the column listing, it references amz_audible_countries.id. It can be especially useful for users looking to purchase products from a specific region.
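The parent relationship described above can be illustrated with a small sketch that walks from one record up to its root. It assumes that parent stores the id of another row in the same table, which the source does not confirm; the sample rows and the ancestor_chain helper are hypothetical.

```python
# Hypothetical sample rows; assumes `parent` holds the `id` of another row.
rows = [
    {"id": 1, "parent": None},  # root
    {"id": 2, "parent": 1},
    {"id": 3, "parent": 2},
    {"id": 4, "parent": 5},     # 4 and 5 point at each other (a cycle)
    {"id": 5, "parent": 4},
]


def ancestor_chain(rows, start_id):
    """Return the ids of all ancestors of start_id, nearest first.

    Stops at a NULL parent, at a parent id that does not exist,
    or when a cycle is detected.
    """
    parent_of = {r["id"]: r["parent"] for r in rows}
    chain = []
    seen = {start_id}
    current = parent_of.get(start_id)
    while current is not None and current in parent_of and current not in seen:
        chain.append(current)
        seen.add(current)
        current = parent_of[current]
    return chain


chain_for_3 = ancestor_chain(rows, 3)
chain_for_4 = ancestor_chain(rows, 4)
```

The seen set is a guard against cyclic data, since nothing in the schema listing prevents two rows from naming each other as parent.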

Consider an imaginary scenario where you are given the task of creating a new entry in the Amazondata table. However, there appears to be some inconsistency in the rules set forth:

  1. Each Amazondata category has at least one item.
  2. Not every parent of an item is unique; multiple items can have the same parent.
  3. An item cannot belong to multiple countries, and no two different items can share the same country ID (to avoid confusion in the user’s product range).
  4. The parent, updated_at and created_at fields contain values for every Amazondata entry unless the customer(s) supplied them as null, in which case Amazon has not added any new related records to the item since.
  5. An item can’t be linked to itself, and the ‘Link’ field is unique per record.
  6. The table currently contains 1000+ items, with close to 100% of fields entered accurately, except for some minor, inconsistent entries that require special consideration before insertion.
  7. As a Quality Assurance Engineer, your task is to review 500 Amazondata records and find the inconsistencies in terms of the following properties:
    • Unique Category URL ID.
    • Missing parent entry.

Question: What process/methods would you implement as a QA engineer to resolve these issues?

Start by listing all unique values in the category_url_id field, marking the categories under which no other items have been classified.
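This first check could be sketched as follows. The sample rows and the category_url_id_report helper are hypothetical; the sketch only counts how often each category_url_id occurs and separates values used once from values used more than once.

```python
from collections import Counter

# Hypothetical sample rows standing in for amz_audible_categories records.
rows = [
    {"id": 1, "category_url_id": 100},
    {"id": 2, "category_url_id": 200},
    {"id": 3, "category_url_id": 200},
    {"id": 4, "category_url_id": None},
]


def category_url_id_report(rows):
    """Return (singletons, duplicates) for category_url_id, ignoring NULLs."""
    counts = Counter(
        r["category_url_id"] for r in rows if r["category_url_id"] is not None
    )
    singletons = sorted(value for value, n in counts.items() if n == 1)
    duplicates = sorted(value for value, n in counts.items() if n > 1)
    return singletons, duplicates


singletons, duplicates = category_url_id_report(rows)
```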

Identify the records with missing or incorrect parent entries, applying the same approach as in the first step to each item: if no relation is specified (‘No Related Record’), or the record has a ‘RelatedRecordId’ of 1 (which would be a child reference), mark it as having a missing-parent problem.
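A minimal sketch of such a parent check is shown here, assuming that parent stores the id of another row in the same table (the source does not state this explicitly). It flags a NULL parent, a parent pointing at the record itself, and a parent id that matches no record. A root category may legitimately have no parent, so "missing" here means "needs review" rather than "wrong". The helper name and sample rows are hypothetical, and the ‘RelatedRecordId’ rule is not modeled.

```python
# Hypothetical sample rows; assumes `parent` holds the `id` of another row.
rows = [
    {"id": 1, "parent": None},
    {"id": 2, "parent": 1},
    {"id": 3, "parent": 3},
    {"id": 4, "parent": 99},
]


def find_parent_problems(rows):
    """Map record id -> problem label for rows whose parent looks suspect."""
    known_ids = {r["id"] for r in rows}
    problems = {}
    for r in rows:
        parent = r["parent"]
        if parent is None:
            problems[r["id"]] = "missing"          # may be a legitimate root
        elif parent == r["id"]:
            problems[r["id"]] = "self-reference"   # record is its own parent
        elif parent not in known_ids:
            problems[r["id"]] = "dangling"         # parent id matches no record
    return problems


parent_problems = find_parent_problems(rows)
```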

Review every record to confirm that no two different items share the same country_id. If a duplicate country_id is detected, note it for corrective action.
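One way to sketch this review is to group record ids by country_id and keep only the groups with more than one member. The sample rows and the duplicate_country_ids helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample rows.
rows = [
    {"id": 1, "country_id": 10},
    {"id": 2, "country_id": 20},
    {"id": 3, "country_id": 10},
    {"id": 4, "country_id": None},
]


def duplicate_country_ids(rows):
    """Map each country_id used by more than one record to those record ids."""
    ids_by_country = defaultdict(list)
    for r in rows:
        if r["country_id"] is not None:
            ids_by_country[r["country_id"]].append(r["id"])
    return {c: ids for c, ids in ids_by_country.items() if len(ids) > 1}


country_duplicates = duplicate_country_ids(rows)
```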

Evaluate each ‘Link’ using the ‘Deduplication’ tool of your QA software. Identify records whose link is not a unique URL, or that could be links to self-referential items; flag any such record as an inconsistency needing attention.
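Where no deduplication tool is at hand, a GROUP BY query gives a comparable check for repeated links. The sketch below uses an in-memory SQLite database as a stand-in for the real one, with a cut-down two-column version of the table and made-up example.com URLs; all of that is assumed for illustration. It covers duplicate links only, not self-referential ones.

```python
import sqlite3

# In-memory SQLite stand-in; only the columns needed for this check.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE amz_audible_categories (id INTEGER PRIMARY KEY, link TEXT)"
)
conn.executemany(
    "INSERT INTO amz_audible_categories (id, link) VALUES (?, ?)",
    [
        (1, "https://example.com/cat/a"),
        (2, "https://example.com/cat/b"),
        (3, "https://example.com/cat/a"),
        (4, None),
    ],
)

# Links that appear on more than one record, with their occurrence count.
duplicate_links = conn.execute(
    """
    SELECT link, COUNT(*) AS n
    FROM amz_audible_categories
    WHERE link IS NOT NULL
    GROUP BY link
    HAVING COUNT(*) > 1
    ORDER BY link
    """
).fetchall()
```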

Check the updated_at and created_at fields of each record. If these dates are missing, or out of sequence with others in the same category or tree path, mark those categories for correction and verification, respectively.
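A simple version of this timestamp check might look like the sketch below. It flags records with a missing timestamp and records whose updated_at is earlier than created_at; comparing dates across a category or tree path is left out because the source does not define what "out of sequence" means there. The helper name and sample data are hypothetical.

```python
from datetime import datetime, timezone

t0 = datetime(2023, 1, 1, tzinfo=timezone.utc)
t1 = datetime(2023, 6, 1, tzinfo=timezone.utc)

# Hypothetical sample rows.
rows = [
    {"id": 1, "created_at": t0, "updated_at": t1},
    {"id": 2, "created_at": t1, "updated_at": t0},
    {"id": 3, "created_at": None, "updated_at": t1},
]


def timestamp_problems(rows):
    """Map record id -> problem label for missing or inverted timestamps."""
    problems = {}
    for r in rows:
        created, updated = r["created_at"], r["updated_at"]
        if created is None or updated is None:
            problems[r["id"]] = "missing timestamp"
        elif updated < created:
            problems[r["id"]] = "updated_at precedes created_at"
    return problems


ts_problems = timestamp_problems(rows)
```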

As a final step, manually go through all 500 records using the information above. Correct any discrepancies found: non-unique category_url_id values (from the first step), missing parent entries (from the second step) and duplicate country_id values (from the third step).

Answer: By combining these methods, a Quality Assurance engineer can identify and correct the listed inconsistencies and improve the accuracy of the table.

Columns

Column Type Size Nulls Auto Default Children Parents Comments
id int8 19 null
    Children: amz_audible_books_categories_m2m.amzaudiblecategory_id (amz_audible_books_ca_amzaudiblecategory_i_cfbdb212_fk_amz_audib, R)
    Children: amz_audible_category_history.category_id (amz_audible_category_category_id_3c90a2b8_fk_amz_audib, R)
category_url_id int8 19 null
name varchar 255 null
link varchar 255 null
parent int8 19 null
updated_at timestamptz 35,6 null
created_at timestamptz 35,6 null
country_id int8 19 null
    Parents: amz_audible_countries.id (amz_audible_categori_country_id_96e3c0eb_fk_amz_audib, R)
check_by_validation bool 1 null
in_data_validation bool 1 null
status_data_validation jsonb 2147483647 null
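For working with rows of this table in application code, the column list can be mirrored in a small record type. The following is only a sketch: the class name AmzAudibleCategory is hypothetical, the Python types are an assumed mapping from the listed database types, and every field except id is typed as optional because the listing above does not make nullability recoverable.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, Optional


@dataclass
class AmzAudibleCategory:
    """Hypothetical in-memory mirror of one amz_audible_categories row."""

    id: int                                       # int8, primary key
    category_url_id: Optional[int] = None         # int8
    name: Optional[str] = None                    # varchar(255)
    link: Optional[str] = None                    # varchar(255)
    parent: Optional[int] = None                  # int8
    updated_at: Optional[datetime] = None         # timestamptz
    created_at: Optional[datetime] = None         # timestamptz
    country_id: Optional[int] = None              # int8, amz_audible_countries.id
    check_by_validation: Optional[bool] = None    # bool
    in_data_validation: Optional[bool] = None     # bool
    status_data_validation: Optional[Any] = None  # jsonb


now = datetime(2024, 1, 1, tzinfo=timezone.utc)
root = AmzAudibleCategory(id=1, name="Example root", created_at=now, updated_at=now)
child = AmzAudibleCategory(
    id=2, name="Example child", parent=root.id, created_at=now, updated_at=now
)
```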

Indexes

Constraint Name Type Sort Column(s)
amz_audible_categories_pkey Primary key Asc id
amz_audible_categories_country_id_96e3c0eb Performance Asc country_id
idx_parent_cat_audible Performance Asc parent
idx_url_cat_audible Performance Asc category_url_id

Relationships