Summary

This article explains the concept of a binary search tree, its characteristics, and use cases, with a focus on its application in efficiently retrieving information about Global Trade Item Numbers (GTINs) in a webshop.

Abstract

The article begins by defining a tree data structure and its advantages over linear data structures. It then introduces the binary tree and binary search tree concepts, highlighting their characteristics and benefits, such as fast search, insert, and delete operations. The article then presents a use case for binary search trees in a webshop that wants to efficiently retrieve information about GTINs using a binary search algorithm. The solution involves creating a Product Class for the data contained in a Node, a NodeP Class to store a list of Products, and a TreeP Class to implement an abstract data type called binary search tree. The article also discusses the issue of unbalanced trees and presents the red-black tree technique as a solution. Finally, the article provides an implementation of a binary search tree and its performance analysis in terms of comparisons.

Opinions

The author believes that tree data structures are more efficient than linear data structures for implementing algorithms.
The author emphasizes the importance of maintaining an ordered binary search tree for efficient search, insert, and delete operations.
The author suggests that the binary search tree is a suitable data structure for efficiently retrieving information about GTINs in a webshop.
The author acknowledges that binary search trees can become unbalanced when the values to be inserted are already ordered, leading to slower data retrieval.
The author proposes the red-black tree technique as a solution to the issue of unbalanced trees.
The author provides an implementation of a binary search tree and its performance analysis in terms of comparisons, highlighting its efficiency in retrieving data.
The author encourages readers to learn more about traversing trees and other data structures and algorithms.

Tree data structure: Binary Search Tree

Tree

A tree is a data structure that consists of nodes connected by edges.

Tree data structures are non-linear data structures, and they allow us to implement algorithms much faster than when using linear data structures.

Binary tree

A Binary tree can have two children: a left node and a right node. Every Node contains two elements: a key used to identify the data stored by the Node and a value that is the data contained in the Node. The following figure shows the binary tree terminology.

Binary search tree

The most common type of Binary tree is the Binary search tree, which has two main characteristics:

· The left Node's value must be less than its parent's value.

· The value of the right Node must be greater than or equal to the value of its parent.

Moreover, you can search in a tree data structure quickly, as you can with an ordered array, and insert and delete items quickly, as you can with a linked list.

Finding a value takes a maximum of log2(N) attempts. As the collection of nodes gets large, the binary search tree becomes faster over a linear search which takes up to (N) comparisons.

Binary search tree: Use Case

A company can use the Global Trade Item Number (GTIN) to identify its trade items uniquely.

The GTIN can identify the types of products that different manufacturers produce.

A Webshop wants to retrieve information about GTINs efficiently using a binary search algorithm.

Solution

We create a Product Class, the data contained in a Node.

We create a NodeP Class to store a list of Products. Moreover, this Class allows us to have two NodeP attributes to hold the left and right nodes.

We create a TreeP Class to implement an abstract data type called binary search tree, which includes a NodeP root variable for the first element to be inserted. We need to implement an insert method, where every time a new GTIN is inserted, it compares the current GTIN versus the new GTIN. Depending on the result, we store the new GTIN on the left or right Node. In this way, the insert method maintains an ordered binary search tree.

A Test case to verify our assumptions:

And here is the implementation for the insert method:

We create a Test case for the find method.

And here is the implementation that shows a find method (the gtin parameter is our key), which iterates through all nodes until a GTIN is found. This algorithm reduces the search space to N/2 because the binary search tree is always ordered.

This Binary Search Tree works well when the data is inserted randomly. Therefore, when the values to be inserted are already ordered, a binary tree becomes unbalanced. With an unbalanced tree, we can not find data quickly.

One approach to solving unbalanced trees is the red-black tree technique, a binary search tree with some special features.

Assuming that we already have a balanced tree, the following algorithm shows us how fast in terms of comparisons could be a binary search tree that depends on a number N of elements. For instance, in 1 billion products, to find a product by GTIN, the algorithm needs only 30 comparisons. See Big O Notation.

Here is our assumption.

And here is our implementation.

Learn to traverse a Tree in this link and the inner workings of other data structures and algorithms easily.

If you consider that one of my stories was useful and want to support me as a writer, join as a Medium member with my referral link!.

I like to share my knowledge and experience in Java Programming, Big O Notation, Algorithms, Coding interview, Data Structures, Clean Code, SOLID Principles, Distributed Systems, Spring Framework, Web Services, and other Tech topics.

Originally published at https://codersite.dev on May 12, 2020.

Join Medium with my referral link - Moises Gamio

As a Medium member, a portion of your membership fee goes to writers you read, and you get full access to every story…

medium.com