Robert Shaneyfelt

Summary

This article provides a comprehensive guide to creating a machine learning application for predicting music preferences using Python, pandas, and scikit-learn.

Abstract

The article titled "Complete Example of Machine Learning" under the broader category of "Python and Artificial Intelligence" offers a detailed walkthrough for developing a predictive model for an online music store. The author, Robert Shaneyfelt, outlines the steps to follow in building an AI application, including importing and cleaning data, splitting data for training and testing, selecting an appropriate machine learning algorithm, training the model, making predictions, and evaluating the model's performance. The example uses a decision tree algorithm from the scikit-learn library to predict users' music preferences based on age and gender. The data, initially stored in Excel CSV files, is manipulated using the pandas library, and the Jupyter code editor is recommended for its ease of use in Python and machine learning projects. The article emphasizes the importance of iterative improvement in AI applications and provides code snippets and output examples to illustrate the process.

Opinions

  • The author emphasizes the practicality of Python for AI applications, noting its popularity and the availability of libraries like pandas and scikit-learn.
  • The use of a decision tree algorithm is suggested for its simplicity and effectiveness in the given example, but the author acknowledges that other algorithms like neural networks might be necessary for more complex tasks.
  • The article suggests that machine learning can significantly enhance an online music store's ability to predict and cater to user preferences, potentially increasing sales.
  • The author makes assumptions about music preferences based on age and gender, which may not hold true in all cases but serve as a starting point for the model.
  • Anaconda and Jupyter are recommended as valuable tools for setting up a development environment for machine learning projects.
  • The importance of iterative development is highlighted, with an emphasis on evaluating and improving the model based on prediction accuracy.

Complete Example of Machine Learning

Python and Artificial Intelligence

Photo by Arseny Togulev on Unsplash

“Machine learning is the science of getting computers to learn without being explicitly programmed.” (Sebastian Thrun)

For further information on artificial intelligence, see my first AI story.

In this story, I will walk you through a complete coding example of a machine learning application. This application will be for an online music store that needs a reliable way to predict what kind of music its users are interested in.

The steps to follow in developing an artificial intelligence application using Python were mentioned in my first story. I list them again here.

  1. Import the data
  2. Clean the data: remove duplicate rows and, if the data is text-based, convert it to numerical values.
  3. Split the data into training and test sets, so we can check that the model produces correct results on data it has not seen.
  4. Create a model: select an algorithm to analyze the data, such as decision trees or neural networks. Each algorithm has pros and cons. Part of what makes Python such a popular language in AI is that libraries already exist that implement many of these algorithms. The library I will use is scikit-learn.
  5. Train the model.
  6. Make predictions. When you start, your predictions are likely to be inaccurate.
  7. Evaluate and improve.
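
The seven steps above can be sketched end to end as follows. This is a minimal sketch, not the exact program developed later in this story: the data is typed inline from the music.csv listing instead of being read from disk, and the 20% test split, the `random_state` values, and the use of `accuracy_score` are assumptions chosen for illustration.

```python
import pandas as pd
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Step 1: import the data (inline here; normally pd.read_csv("music.csv")).
music_data = pd.DataFrame({
    "age": [20, 23, 25, 26, 29, 20, 31, 33, 37, 20, 21, 25, 26, 27, 30, 31, 34, 35],
    "gender": [1] * 5 + [0] + [1] * 3 + [0] * 9,
    "genre": ["Hip-hop"] * 3 + ["jazz"] * 3 + ["classical"] * 3
             + ["dance"] * 3 + ["acoustic"] * 3 + ["classical"] * 3,
})

# Step 2: clean the data by dropping exact duplicate rows.
music_data = music_data.drop_duplicates()

# Step 3: split into training and test sets (test_size=0.2 is an assumption).
X = music_data[["age", "gender"]]
y = music_data["genre"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Steps 4 and 5: create the model and train it on the training rows only.
model = DecisionTreeClassifier(random_state=0)
model.fit(X_train, y_train)

# Step 6: make predictions for the rows the model has not seen.
predictions = model.predict(X_test)

# Step 7: evaluate. With so few rows the score moves a lot between splits.
score = accuracy_score(y_test, predictions)
print(f"accuracy on {len(X_test)} held-out rows: {score:.2f}")
```

Because the data set is tiny, a single accuracy number says little; the point of the sketch is only the order of the steps.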

My first story contains the link to download the Python programming language.

The initial data will be derived from the music store's current users. The main purpose of this sample application is to increase music sales, so it should reliably predict what kind of music each new user likes.

The data will come in the form of an Excel spreadsheet or CSV file. CSV is a popular format for storing initial data; one site that provides data sets like this is kaggle.com.

I will use the Python programming language and two libraries commonly used in AI:

Pandas: a data analysis library built around a structure called the data frame. A data frame is a two-dimensional object, similar to an Excel spreadsheet.

Scikit-learn: provides algorithms such as decision trees and neural networks.

I will use Jupyter, a good code editor for Python and machine learning projects. Jupyter makes inspecting data much easier.
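
To make the data frame idea concrete, here is a small sketch that builds one by hand from three rows of this story's music data and inspects it. The variable names `df` and `older` are chosen here for illustration.

```python
import pandas as pd

# A data frame is a table: named columns, one row per user.
df = pd.DataFrame({
    "age": [20, 26, 31],
    "gender": [1, 1, 1],
    "genre": ["Hip-hop", "jazz", "classical"],
})

print(df.shape)           # (rows, columns)
print(df.columns.tolist())

# Rows can be filtered with a condition on a column, much like a spreadsheet filter.
older = df[df["age"] > 25]
print(older)
```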

It’s best to use Anaconda to install Jupyter for application development like this. Anaconda is available here.

I created three initial data files with the free, open-source program LibreOffice. A CSV file is just a text file, viewable and editable with Notepad, vi, or WordPad. The data in these CSV files is loaded into computer memory with the Python library pandas. The input .csv file and the output .csv file are merely the user profile data in music.csv, split into input and output sections.

There are three columns of data. The first two, the user's age and gender, were split off into the input. The third column, the genre, was split off into the output.

The three CSV files used in the program are based on several assumptions. I split the data into an input portion and an output portion to feed the decision tree algorithm.

Keeping in mind that Python is an interpreted language, I display the code, the contents of the original CSV file, and the contents of the split CSV files.

The assumptions made were that males younger than 26 preferred hip-hop, males aged 26 to 30 liked jazz, and males over 30 liked classical. Females younger than 26 liked the dance genre, females aged 26 to 30 liked acoustic, and females over 30 liked classical.
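
Since a CSV file is only text, pandas can parse it from a string as easily as from a file. The sketch below feeds a few lines of music.csv to `read_csv` through `io.StringIO`. The `skipinitialspace=True` option is an addition of mine, not something the original program uses; it is there because a hand-edited file can pick up a stray space after a comma.

```python
import io

import pandas as pd

# A few rows of music.csv as plain text, including spaces after some commas.
csv_text = """age, gender, genre
20,1,Hip-hop
26,1, jazz
31,1,classical
"""

# skipinitialspace=True drops whitespace that follows each delimiter,
# so the column names and values come out without leading spaces.
sample = pd.read_csv(io.StringIO(csv_text), skipinitialspace=True)
print(sample.columns.tolist())
print(sample["genre"].tolist())
```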
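
I made the split by hand in a spreadsheet, but pandas can do the same thing. The sketch below is one possible way, with the data typed inline and the names `X` and `y` chosen here for the input and output portions. It prints the resulting CSV text instead of writing files.

```python
import pandas as pd

music_data = pd.DataFrame({
    "age": [20, 23, 25, 26, 29, 20, 31, 33, 37, 20, 21, 25, 26, 27, 30, 31, 34, 35],
    "gender": [1] * 5 + [0] + [1] * 3 + [0] * 9,
    "genre": ["Hip-hop"] * 3 + ["jazz"] * 3 + ["classical"] * 3
             + ["dance"] * 3 + ["acoustic"] * 3 + ["classical"] * 3,
})

# Input: every column except the one we want to predict.
X = music_data.drop(columns=["genre"])
# Output: only the genre column, kept as a one-column data frame.
y = music_data[["genre"]]

# to_csv() with no path returns the CSV text instead of writing a file.
input_csv = X.to_csv(index=False)
output_csv = y.to_csv(index=False)
print(input_csv.splitlines()[:3])
print(output_csv.splitlines()[:3])
```

Passing a file name such as `X.to_csv("InputMusic.csv", index=False)` would write the same text to disk.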

For gender, a 1 means male, and a 0 means female.
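
Written as code, the assumptions look like the function below. It is only an illustration of the rules used to make up the data, not part of the machine learning program. The function name `assumed_genre` is hypothetical, and the middle age band is read as ages 26 through 30.

```python
def assumed_genre(age: int, gender: int) -> str:
    """Return the genre the assumptions assign (gender: 1 = male, 0 = female)."""
    if age > 30:
        # Over 30, both genders are assumed to like classical.
        return "classical"
    if gender == 1:
        return "Hip-hop" if age < 26 else "jazz"
    return "dance" if age < 26 else "acoustic"


print(assumed_genre(21, 1))
print(assumed_genre(22, 0))
```

The decision tree is never given these rules. It has to recover something like them from the rows of data alone.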

Music.csv

age,gender,genre
20,1,Hip-hop
23,1,Hip-hop
25,1,Hip-hop
26,1,jazz
29,1,jazz
20,0,jazz
31,1,classical
33,1,classical
37,1,classical
20,0,dance
21,0,dance
25,0,dance
26,0,acoustic
27,0,acoustic
30,0,acoustic
31,0,classical
34,0,classical
35,0,classical

For the model, the example uses the decision tree algorithm, provided by the DecisionTreeClassifier class in the tree module of the scikit-learn library (sklearn.tree). The decision tree takes input and output data in order to form its predictions. If the results look bad, you can try another algorithm, such as a neural network.
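
One way to see what the decision tree has learned is to print its rules as text. The sketch below uses `export_text` from `sklearn.tree`, which is not part of the original program. The data is typed inline and `random_state=0` is an assumption added so repeated runs build the same tree.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

music_data = pd.DataFrame({
    "age": [20, 23, 25, 26, 29, 20, 31, 33, 37, 20, 21, 25, 26, 27, 30, 31, 34, 35],
    "gender": [1] * 5 + [0] + [1] * 3 + [0] * 9,
    "genre": ["Hip-hop"] * 3 + ["jazz"] * 3 + ["classical"] * 3
             + ["dance"] * 3 + ["acoustic"] * 3 + ["classical"] * 3,
})
X = music_data[["age", "gender"]]
y = music_data["genre"]

model = DecisionTreeClassifier(random_state=0)
model.fit(X, y)

# Each printed line is a test on age or gender; each leaf names a genre.
rules = export_text(model, feature_names=["age", "gender"])
print(rules)
```

Reading the printed rules against the assumptions listed earlier is a quick check that the tree picked up the age and gender boundaries built into the data.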

Note that the data contains no 21-year-old male and no 22-year-old female. The model was asked for predictions for these ages and genders. In this way the machine infers the music preferences of new users from what it has learned, and once their actual preferences are known, their data can be added to the training set to improve the model.

predictions = model.predict([ [21,1], [22, 0] ] )
print(predictions)

['Hip-hop' 'dance']
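
Adding the new users' data could look like the sketch below. It assumes, purely for illustration, that both users later confirmed the predicted genres; the `new_users` rows are hypothetical and not part of the original data files.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

music_data = pd.DataFrame({
    "age": [20, 23, 25, 26, 29, 20, 31, 33, 37, 20, 21, 25, 26, 27, 30, 31, 34, 35],
    "gender": [1] * 5 + [0] + [1] * 3 + [0] * 9,
    "genre": ["Hip-hop"] * 3 + ["jazz"] * 3 + ["classical"] * 3
             + ["dance"] * 3 + ["acoustic"] * 3 + ["classical"] * 3,
})

# Hypothetical new users whose real preferences have since been confirmed.
new_users = pd.DataFrame({
    "age": [21, 22],
    "gender": [1, 0],
    "genre": ["Hip-hop", "dance"],
})

# Append the new rows and retrain on the larger data set.
updated = pd.concat([music_data, new_users], ignore_index=True)
model = DecisionTreeClassifier(random_state=0)
model.fit(updated[["age", "gender"]], updated["genre"])
print(len(music_data), "->", len(updated), "rows")
```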

Below is the entire code along with the displayed output listed in the interpreter.

import pandas as pd
from sklearn.tree import DecisionTreeClassifier

music_data = pd.read_csv("music.csv")

input = pd.read_csv('InputMusic.csv')
print(input)

    age  gender
0    20       1
1    23       1
2    25       1
3    26       1
4    29       1
5    20       0
6    31       1
7    33       1
8    37       1
9    20       0
10   21       0
11   25       0
12   26       0
13   27       0
14   30       0
15   31       0
16   34       0
17   35       0

In [43]:

output = pd.read_csv('outputMusic.csv')
print(output)
        genre
0     Hip-hop
1     Hip-hop
2     Hip-hop
3        jazz
4        jazz
5        jazz
6   classical
7   classical
8   classical
9       dance
10      dance
11      dance
12   acoustic
13   acoustic
14   acoustic
15  classical
16  classical
17  classical

In [44]:

model = DecisionTreeClassifier()
model.fit(input, output)
predictions = model.predict([ [21,1], [22, 0] ] )
print(predictions)
['Hip-hop' 'dance']

In [45]:

print(music_data)
    age  gender      genre
0    20       1    Hip-hop
1    23       1    Hip-hop
2    25       1    Hip-hop
3    26       1       jazz
4    29       1       jazz
5    20       0       jazz
6    31       1  classical
7    33       1  classical
8    37       1  classical
9    20       0      dance
10   21       0      dance
11   25       0      dance
12   26       0   acoustic
13   27       0   acoustic
14   30       0   acoustic
15   31       0  classical
16   34       0  classical
17   35       0  classical

Copyright © 2022, Robert Shaneyfelt All rights reserved
