🤖Implement your first AI
A Machine Learning approach in predicting diabetes
We will go through the whole process of an implementation for this Framework:
The Problem
Firstly, we have to think about the Problem we want to solve: For this Example we will try to predict if someone has diabetes based on some simple input
Pregnancies
Glucose
BloodPressure
SkinThickness
Insulin
BMI
DiabetesPedigreeFunction
Age
The needed dataset can be downloaded here
Step 1. Input Parsing
Data Object
Let's start with parsing the Input in a Java Class. For this, we create a class called DiabetesDataSet.
It will be used to hold the data.
public class DiabetesDataSet {
private final int pregnancies;
private final int glucose;
private final int bloodPressure;
private final int skinThickness;
private final int insulin;
private final double bmi;
private final double diabetesPedigreeFunction;
private final int age;
private final boolean diabetes;
public DiabetesDataSet(String[] values){
pregnancies = Integer.parseInt(values[0]);
glucose = Integer.parseInt(values[1]);
bloodPressure = Integer.parseInt(values[2]);
skinThickness = Integer.parseInt(values[3]);
insulin = Integer.parseInt(values[4]);
bmi = Double.parseDouble(values[5]);
diabetesPedigreeFunction = Double.parseDouble(values[6]);
age = Integer.parseInt(values[7]);
diabetes = values[8].equals("1");
}
public Vector toVector(){
return new Vector(pregnancies,
glucose,
bloodPressure,
skinThickness,
insulin,
bmi,
diabetesPedigreeFunction,
age);
}
//Getter and Setter...
}
Reading the Input
As the next step we will read the data from the file.
In Java, there are several ways to read data from a File. Since Java 8 we do have the convenient method Files.lines(...)
though, which returns us a Stream of each line of the file. Using this the resulting class InputParser
may look like this:
public class InputParser { private static final Path dataOrigin = Path.of(<source>); public static final int amountOfUnseenData = 100; private final List<DiabetesDataSet> trainingsData; private final List<DiabetesDataSet> unseenData; public InputParser() throws IOException { List<DiabetesDataSet> inputData = Files.lines(dataOrigin) .skip(1) //Skip the headers of the .csv file .map(set -> set.split(",")) .map(DiabetesDataSet::new) .collect(Collectors.toList()); //We don't want our Algorithm to just "remember" the data //Therefore we split the data up in trainings data and in unseen data, //with which we can check our result afterwards this.trainingsData = inputData.subList(amountOfUnseenData, inputData.size()); this.unseenData = inputData.subList(0, amountOfUnseenData); } //Getter and Setter }
Let's add one more method that returns us one random element from our trainings data:
private final AtomicBoolean lastOneHadDiabetes = new AtomicBoolean();
private final Random random = new Random();
public DiabetesDataSet getRandomTrainingsDataSet(){
DiabetesDataSet returnValue = trainingsData.get(random.nextInt(trainingsData.size()));
while(returnValue.hasDiabetes() == lastOneHadDiabetes.get())
returnValue = trainingsData.get(random.nextInt(trainingsData.size()));
lastOneHadDiabetes.set(returnValue.hasDiabetes());
return returnValue;
}
Step 2. Defining a Fitness Function
The next thing is to define the Fitness Function for our Diabetes Prediction. This is actually much easier than it may seem. Considering there are only 2 outcomes (has diabetes or not) we may only return 1 or 0 as fitness values.
final InputParser inputParser = new InputParser();
NeuralNetFitnessFunction fitnessFunction = (nn) -> {
DiabetesDataSet data = inputParser.getRandomTrainingsDataSet();
//Due to the default activation function in the Neural Network the output is between 0 and 1
//If the output is greater than 0.5 we will interpret this as "Patient has diabetes"
boolean prediction = nn.calcOutput(data.toVector()).get(0) > 0.5;
//If the prediction was correct return 1 otherwise return 0
return prediction == data.hasDiabetes() ? 1 : 0;
};
Step 3. Create and Configure the Algorithm
The Brain (Neural Network)
To create a brain for our Machine Learning Algorithm, we will have to create a NeuralNetSupplier
using the Builder
of the NeuralNet
class.
NeuralNetSupplier neuralNetSupplier = ( ) ->
new NeuralNet.Builder( 8, 1 )
// .addLayer(4)
.build( );
NeuralNetSupplier
This class will provide our Algorithm with its first population. We did actually already implement everything we need for this:
public static final int POP_SIZE = 1000;
//...
NeuralNetPopulationSupplier nnPopSup = new NeuralNetPopulationSupplier(neuralNetSupplier, fitnessFunction, POP_SIZE);
Selector, Recombiner and Mutator
We now have to decide which implementations we want to use for those Interfaces. Feel free to alter the values or choose even completely other implementations!
private static final Selector<NeuralNetIndividual> SELECTOR = new EliteSelector<>( 0.3 );
private static final Recombiner<NeuralNetIndividual> RECOMBINER = new NNUniformCrossoverRecombiner(2);
private static final Mutator<NeuralNetIndividual> MUTATOR = new NNRandomMutator( 0.3, 0.10, new Randomizer( -0.5, 0.5 ), 0.01 );
If you dont know what they do this is no Problem. Just copy them and try out some stuff on your own. You cant break anything, all implementations work with every other!
Genetic Algorithm
Now we can put everything together with the Builder of the GeneticAlgorithm
class
public static final int GENS = 2000;
//...
GeneticAlgorithm<NeuralNetIndividual> geneticAlgorithm = new GeneticAlgorithm.Builder<>(nnPopSup, GENS, SELECTOR)
.withRecombiner(RECOMBINER)
.withMutator(MUTATOR)
.withMultiThreaded(16) //uses 16 Threads to process
.withLoggers(new IntervalConsoleLogger(100)) //used to print logging info in the console every 100th generation
.build();
With geneticAlgorithm.solve()
we finally start the solving process!
As a return value, we get our trained Neural Network.
Step 4. Check the result
Hurray you wrote your first AI! But wait, so far we only trained it and have not used it yet! Exactly for this, we saved 100 unseen datasets earlier in this tutorial. Let's write a method that tests our new trained Neural Network:
public static void testModel(InputParser inputParser, NeuralNet model){
long correctGuesses = inputParser.getUnseenData()
.stream()
.filter(data -> {
boolean prediction = model.calcOutput(data.toVector()).get(0) > 0.5;
System.out.println("prediction is " + prediction + " - patient has diabetes: " + data.hasDiabetes());
return prediction == data.hasDiabetes();
})
.count();
System.out.println("The model guessed " + ((100d*correctGuesses)/InputParser.amountOfUnseenData) + "% correct");
}
It basically works similar to our Fitness Function we wrote earlier. The only difference is that we are counting the correct results, and we added some extra logging as well.
Step 5. Run the code
It's finally time to execute our code. To do so, just call the geneticAlgorithm.solve()
method and give its result to our test method:
NeuralNetIndividual result = geneticAlgorithm.solve();
testModel(inputParser, result.getNN());
For me the test method always yielded an accuracy of roughly 65%- 70%.
The model guessed 67.0% correct
Last updated
Was this helpful?