PRATIK JAWAHAR
  • Home
  • Research Work
  • About
  • PRAkTIKal-ly Speaking
  • Contact
  • Resume

The wait is over!
Peek inside PRAkTIKal's HQ!

How I made a dumb F1 Vietnamese GP Prediction Algo:

4/5/2020

TL;DR:
  • Automate scraping of race stats from the official website using the Selenium and BeautifulSoup/LXML packages (the latter two are similar), to get data from all races in 2018 and 2019
  • Clean data -> min-max normalization -> augment the 44 data points to ~4000 samples using simple mean and median shifts
  • Sliding window transformation to make it look like a supervised training problem
  • Train a shallow CNN+LSTM regressor with very small hidden layer sizes, because data already has strong correlation => high generalization
  • Save weights at train accuracies of 50% (very low learning rate => barely any weight updates) for the Lap 10 prediction, 60% for Lap 20, and so on through 70%, 80%, 90% and >90%.
  • Add randomness using weighted regularization of random variables such that prediction does not converge to final result in first pass
  • Add randomized weights (pos or neg) to predictions for Max Verstappen since he was chosen as the wild card. 60% chance of weight being positive (Cause the poor kid was starting from the back of the grid)
  • Et voila! C'est tout.

Algo, without the jargon (tried my best to avoid):

Friday 3:30 am
Location: Somewhere near Boston, USA:

The phone had enough charge, so I could turn over to the other side and keep watching F1 highlights on Insta, frustrated about the effect of COVID-19 on the F1 2020 calendar. Nothing about the current situation of the world (Recession, Deaths, Rising cases, China re-opening their live markets, Toilet pap... oh wait I'm Indian) tormented me more than this. I couldn't care less about any of that, but a whole F1 season in jeopardy?! Not having it. Ergo, F1_Predictor.py was born.

Getting Data:
  • Couldn't find a readily available dataset online (Frankly didn't look much, too much effort)
  • Wrote a web scraper to automatically log into the official website, navigate to the results page for each race and extract data from the tables. This is super easy for the official F1 website because the site doesn't have active scraper protection and is all written in standard HTML.
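
The table-extraction step could look roughly like the sketch below. It is not the original scraper: it swaps Selenium/BeautifulSoup for Python's built-in html.parser so it runs without extra packages, and the sample markup, column names and helper names (TableRows, parse_results) are made up for illustration.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Made-up sample markup, not copied from the real results pages.
SAMPLE_HTML = """
<table>
  <tr><th>Pos</th><th>Driver</th><th>Laps</th></tr>
  <tr><td>1</td><td>Driver A</td><td>58</td></tr>
  <tr><td>DNF</td><td>Driver B</td><td>12</td></tr>
</table>
"""


def parse_results(html):
    """Turn the first row into column names and the rest into dicts."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


results = parse_results(SAMPLE_HTML)
```

In a real run the HTML string would come from the page the browser automation has navigated to, rather than from a hard-coded sample.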
Forming the Dataset: 
  • Cleaned extracted data (replaced 'driver names' with integers from 0-19, replaced 'DNF' with -1) to make it easier to work with
  • Normalized data (Min-Max Normalization), for easier calculation of probabilities. In very loose terms, this is required to bring all the data into the same range (0-1) so it's easier for the network to find correlations and learn patterns. These patterns are then used to make the predictions
  • The network won't learn shit with just 44 data points to train on, so augmented data to ~4000 samples using a naive mean/median-shift algo.
    • The mean of each race is calculated for each feature column
    • The mean is shifted randomly by a factor of 1E-6 and each column is reconstructed to match the same distribution
    • Do the same for median of each column in each race
    • Frankly don't know how effective/correct this is, but it gave kinda realistic results (After implementing my cheats explained in the Cheats section :P)
  • Sliding window transformation to make the problem easier to deal with: Bringing in factors like a driver's morale from the previous race and modelling their effects on the current race is important, but difficult to add to the data explicitly. So we make a sliding window transformation:
    • Say, available data points: [race 1, race 2, race 3, race 4, race 5....]
    • The network's inputs (loose terms) would be:
      • Input 1: [race 1, race 2, race 3]; Output 1: [race 4]
      • Input 2: [race 2, race 3, race 4]; Output 2: [race 5]
      • ..... so on
    • This helps the network learn if there were any effects of race 1, race 2, race 3 on race 4; Effects of race 2, race 3, race 4 on race 5.... and so on.
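
The cleaning, min-max normalization and sliding window steps above can be sketched in a few lines of plain Python. The helper names (clean_position, min_max_normalize, sliding_windows) and the toy values are assumptions for illustration, not the original F1_Predictor.py code.

```python
def clean_position(value):
    """Map a finishing position to a number; 'DNF' becomes -1 as in the post."""
    return -1.0 if value == "DNF" else float(value)


def min_max_normalize(column):
    """Rescale a list of numbers to the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:  # constant column: avoid dividing by zero
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


def sliding_windows(races, window=3):
    """Pair each run of `window` consecutive races with the race that follows."""
    return [
        (races[i:i + window], races[i + window])
        for i in range(len(races) - window)
    ]


# Toy column of finishing positions for one driver (made-up values).
positions = [clean_position(v) for v in ["1", "3", "DNF", "5"]]
normalized = min_max_normalize(positions)

# Same example as in the bullets: three races in, the next race out.
pairs = sliding_windows(["race 1", "race 2", "race 3", "race 4", "race 5"])
```

Here pairs holds two (input, output) samples, matching Input 1/Output 1 and Input 2/Output 2 above; in practice each "race" would be a row of normalized features rather than a label.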
The Network: 
  • Supervised training is generally (don't hold me to it) performed for two major tasks (Classification, Regression). Very basic idea:
    • Classification: A network trained to classify dogs and cats, when given an unseen image, tries to check if it's closer to a dog or a cat, and gives a class prediction accordingly
    • Regression: Given a set of data points, the network tries to fit a curve that most closely represents the data distribution. This curve can then be extrapolated to make future predictions
  • We are not trying to classify anything in our case, but trying to predict future results, so our network could loosely be called a regressor (More accurately: a time-series predictor)
  • Explaining the network without jargon is too much work so go here to read about CNNs and here to read about LSTMs
  • The network itself: 3 Conv Layers, Feeding into a single LSTM layer, Feeding into Dense layers that spit out predictions

  Cheats: 
  • The existing data is strongly correlated (e.g. Hamilton has just won way too many times for the network to not predict straight away that he'd be on P1 and remain that way for the entire race). That is sadly true, but it's no fun if the network also mirrors the bitter reality. Hence the following cheats:
  • Regularization: Adding a random weight to certain parameters in the loss function. So basically, the loss function is like a combined representation of what's going on in the network at each step. The aim of the network is to do whatever it takes to minimize the loss function, and whatever the network did to achieve minimum loss is what we want the network to learn.
    • The problem with this: For the given data there are way too many patterns that the network could learn in a minute, causing it to converge to the final solution extremely fast. This is because it is a given that Hamilton, Max, Vettel would always be in the top order, and George Russell could never win! But that's no fun! There needs to be some random drama.
    • Regularization generally adds a high weight (a penalty) to unnecessary parameters like noise, to tell the network to stop focussing on the noise and concentrate on the more important features instead. Otherwise the network only minimizes that which can be easily minimized.
    • The way I used it though, essentially adds a weight to random terms in the loss function to make sure they aren't minimized too fast.
    • This is why Sainz went all the way up to P4, else the network would've predicted a P6/P7 for him right at the start.
  • Adding a wild card agent: The most interesting races have that one unsuspecting driver who ends up achieving something miraculous. Races that finish exactly the way they started are called ummm... what's that phrase?! oh yeah! FUCKING BORING!
    • So I took a poll and Max ended up getting the Max votes...cause Max -> Max.... NVM.
    • What that basically meant is whatever predictions Max got were multiplied by a random factor that could be positive or negative. This means he would be either the star of the show or the mega crash of the day, left to a coin toss
    • In the first 10 laps he got a positive weight causing him to jump 7 positions in just 10 laps. (Which is not impossible for Max :P)
    • Between 10-20 he jumped 4 positions which was what the network predicted cause the random weight was close to 1
    • Between 20-30 the network predicted him to go up 5 places (which is not too big a stretch for Max, but c'mon), and the negative weight meant he only jumped 2 positions. The same happened till lap 50.
    • Between 50-60 the network predicted P4 -> P2 but again, a negative weight nullified the prediction and he finished P4
    • If not for the randomized weights Max would've gone P20 -> P2 in 20 laps. Could've been fun to watch if he actually pulled it off, but let there be some amount of reality in what we're doing :P
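
The wild card cheat can be sketched as below. This is only a guess at the mechanics: the post gives the 60% chance of a positive weight, while the function name wild_card_gain, the max_scale cap of 2.0 and the uniform distribution of the magnitude are assumptions made for illustration.

```python
import random


def wild_card_gain(predicted_gain, rng, p_positive=0.6, max_scale=2.0):
    """Scale a predicted position gain by a random, possibly negative, factor."""
    magnitude = rng.uniform(0.0, max_scale)  # assumed range, not from the post
    sign = 1.0 if rng.random() < p_positive else -1.0
    return predicted_gain * sign * magnitude


# Seeded generator so the sketch is repeatable.
rng = random.Random(42)

# Say the network predicts Max gaining 5 places in a 10-lap stint.
gains = [wild_card_gain(5, rng) for _ in range(10000)]
share_positive = sum(g > 0 for g in gains) / len(gains)
```

Over many draws share_positive lands close to 0.6, which is the biased coin toss described above; a factor near 1 leaves the network's prediction almost untouched, as in laps 10-20.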
Picture
Et Voila! We've had our first Vietnamese GP in a while. COVID-19 can go fuck itself.

"F1 is simple af! Just a bunch of cars trying to go faster than the others..pfft, Easy!". Let me welcome you to the world of strategy in motorsport! Beginner's Guide to "Undercuts" in F1:

1/24/2020


If you're thinking what on Earth is going on in this video, this article is for you! 

When it comes to F1 and motorsport in general, not a lot of people know about the myriad strategies that teams go through over the course of a single race. This is primarily because F1 strategies are usually masked and disguised by codes and technical jargon so that the other teams don't get to know them. This is what gives the sport an entirely new dimension that a NOOB like you (or even me a few years ago) is completely oblivious to. Yes, you heard it! F1 is not just a bunch of cars running around in circles! So let us look at the most common and most widely used strategy for starters. Behold, "The Undercut" *cue mysterious smoke, dramatic music*.

To understand the undercut, you must first know the basics of an F1 pitstop. So why do cars go into a lane all of a sudden and a bunch of superhumans change all 4 tyres in 2 secs to watch the car zoom back out?

The Pit-Stop

F1 in today's day and age permits drivers to pit the car during the race for a bunch of reasons. The most common being for a change of tyres, but pitstops could also be for other reasons such as a nose change in case your front wing is damaged. Let us stick to tyre changes because that is one of the key factors that affects an undercut. "Why come to a stop and change tyres? Won't you win the race if you just keep going start to finish?"
Picture
Modern day F1 operates on three main tyre choices that all teams have. The tyres shown above represent the Hard Compound (white ring), Medium Compound (yellow ring) and the Soft Compound (red {bull :P} ring). When these tyres are fresh, they provide optimal traction. Think of traction like the frictional force that keeps you stuck to the track and prevents you from skidding off into the walls. With each lap however, at the speeds that F1 cars operate at (top speeds over 300 km/h), the friction between the track and the tyres causes wear, thereby reducing the tyres' performance and the traction they can provide. This means you have to take turns slower to avoid skidding off the track, resulting in a loss in track time. Each compound has its own trade-off between traction and wear, bringing in a lot of track-based strategy. The softer the compound, the more cornering traction it gives but the faster it wears out; hards provide comparatively less cornering traction but take longer to wear out. This is essentially why you can't run a whole race (usually around 60 laps more or less) on a single set of tyres.

How do you combat that? There you go! Pit-Stops.   

Quick Guide to the Pit-Stop

  • When to pit is decided by the team engineers and the drivers
  • Pitting requires the driver to drive through the "pit-lane" to the team's garage location
  • Drivers have to limit their speed to 80 km/h while in the pit-lane (to ensure safety of by-standing pit crews). This rule was introduced after an accident at Imola in 1994. Before that drivers used to enter the pit-lane at full speed with tons of people standing a few feet away.
As a result of these rules, though the superhuman pit-crews change all 4 tyres in under 2 secs, the entire pitting maneuver costs drivers around 20 secs of track time. This means when a driver pits, any car that's within ~20 secs behind him will overtake him by the time he comes back onto the track. So teams have to make strategic decisions on when to pit. This is where the undercut comes into play.

The Undercut, finally!

Imagine you are Sebastian Vettel, roaring in a damn cool SF90 for Scuderia Ferrari through the streets of Singapore. You are on P2, with a young and fiery Charles Leclerc ahead of you in his SF90. You are trying really hard to get past him but he just won't let you by! You think you could go faster, but being stuck behind Leclerc forces you to hold back and not go full power cause you don't want to crash into him. So what do you do?
  1. You pit before Leclerc.
  2. You go full on once you're back out on the track
"WHAT?! How stupid can you get?! You just said you lose like 20 secs if you pit. So why would you do that if you wanna overtake the guy ahead of you?!"

Let us say Leclerc, 4 secs ahead of you, is lapping in 100 secs (all numbers exaggerated for representation purposes, don't hold me to this :P). You can't really lap much faster than that if you are constantly racing right behind him, but you think you could do way better than 100 secs if you just had some more space on the race track. So you pit before him, creating a good distance between the two of you. You did lose some time doing this, but keep in mind, he will also have to pit at some point! Now you're back on track with fresh tyres, so you start doing each lap in 97 secs. Leclerc, though he is still ahead, is getting slower and slower owing to deteriorating tyres. He's doing 101, 102, 103 secs on each successive lap. So you lost 20 secs doing the pit stop, but you are now gaining on Leclerc every lap: 4 secs, then 5, then 6. When Leclerc pits, he too loses the 20 secs you lost, cancelling out the effects of the pit-stops. He was initially 4 secs ahead of you, so all you need is to gain more than 4 secs before he pits. The 6 secs from that last lap alone would put you 2 secs ahead; add up all three laps (4 + 5 + 6 = 15 secs) and you come out around 11 secs ahead. Et Voila! You have successfully undercut Leclerc to take the lead of the race! Time for you to say "Grazie Ragazzi!". Watch a live replay of your amazingly strategic overtake below and applaud yourself!
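
For anyone who wants to play with the numbers, here is a minimal sketch of the undercut arithmetic using the exaggerated lap times from the example (97 secs for you; 101, 102, 103 secs for Leclerc; 20 secs lost per pit stop). The function name gap_after_undercut is made up, and real races add traffic, tyre warm-up and other effects that this ignores.

```python
def gap_after_undercut(initial_gap, your_laps, rival_laps, pit_loss=20.0):
    """Return your lead in seconds once both cars have pitted.

    A positive result means you are ahead, a negative one means you are
    still behind.
    """
    # You start `initial_gap` behind, then lose `pit_loss` by pitting first.
    gap = -initial_gap - pit_loss
    # While the rival stays out, each lap shifts the gap by the pace difference.
    for yours, rivals in zip(your_laps, rival_laps):
        gap += rivals - yours
    # The rival finally pits and loses the same `pit_loss`.
    return gap + pit_loss


# Leclerc stays out for three laps on worn tyres while you lap in 97 secs.
lead_three_laps = gap_after_undercut(4.0, [97.0] * 3, [101.0, 102.0, 103.0])

# Counting only a single 6 sec gain (103 vs 97) against the 4 sec deficit.
lead_one_lap = gap_after_undercut(4.0, [97.0], [103.0])
```

Summed lap by lap, the three laps are worth 15 secs and a lead of 11 secs; counting a single 6 sec gain gives the 2 sec margin. Either way the pit-stop losses cancel out, which is the whole point of the undercut.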

I barely know what Formula 1 is! What on Earth is Formula E?! Beginner's Guide to Formula E:

1/23/2020

Formula E is the all-electric vehicle racing series run by the Fédération Internationale de l'Automobile (FIA). I know what you're thinking... If you're well versed with F1, you're probably going "No roaring V-10s?? Pfft". My only reply to that would be, No! We are not in 1989. Today's F1 cars run 1.6L V6 engines, sparking the outrageously hilarious "Bring back the f***ing V12s" comment by Sebastian Vettel after an MGU-K problem during his Russian GP this season. Before I digress any further, let me request you to stay with me while I tell you what FE is and let the videos do the convincing!

Formula E does run on all-electric vehicles and that's the whole point of the series! To have an emission-free racing series! I agree moving the cars from venue to venue costs a lot of unnecessary emissions and blah, but will you just let that be for a while! :P 

So here's how the racing format goes:
  • There is a Qualifying Segment to decide the order in which the cars start, and the Race Segment itself. Unlike F1, where Qualifying takes place a day before the race, both the Qualifying and Race take place on the same day for FE! The FE Qualifying may also be preceded by one or more sessions of Free Practice. The best part: nearly all races take place on street circuits, meaning tightly packed groups and guaranteed nose to tail racing, paving the way for action all throughout the race! (Not the 'bleh' that F1 has been engulfed in recently, with Merc winning just about everything)
  • Qualifying:
    • Drivers are split into 4 groups for the first round of qualifying. Each group of drivers goes out together and gets one flying lap to set the fastest time. The top 6 drivers are selected to compete in the Super Pole, while the others are lined up at the back according to their Quali times.
    • Super Pole: The top 6 drivers get one more flying lap to decide who gets pole position, and the rest are lined up according to their Super Pole times within the top 6.
  • Race: 
    • Shortly after qualifying, the race begins. This is where the main strategic components come into play! Unlike F1, the drivers do not drive for a fixed number of laps, but the race goes on for {45min+1lap}. Huh?!
    • The timer starts as soon as the race starts. At the end of 45 mins, the leader of the race completes the lap he is on and starts off the final lap when he crosses the start point.
  • Teams: 
    • The number of teams may vary each season, but each team is composed of 2 cars and 2 drivers! (No! they do not race autonomous cars. That's a job for RoboRace, a new autonomous racing series! Go check it out!!)
    • There are two types of teams: Manufacturers and Customers. Find out more about their differences in the "Cars" segment below.
  • Drivers: 
    • FE features drivers from all sectors of the motorsports world. These include F1 veterans like Felipe Massa and Jean-Éric Vergne (two time reigning FE Champion), Le Mans champions like Sébastien Buemi and André Lotterer, and young rookies like Nyck de Vries (reigning F2 champion) and Pascal Wehrlein.
  • Cars:
    • The series now features the ravishingly stunning Gen 2 cars, that are significantly improved versions of the Gen 1 cars which were used for the first 4 seasons of FE.
    • All teams have to abide by and use identical Batteries, Chassis and Body Work, to level the playing field.
    • Manufacturer teams have the liberty to change some minor components such as the driving motor, or build their own sub-systems such as the brake-by-wire tech.
    • Customer teams can buy either sub-systems or entire cars from manufacturer teams to compete.
    • There is no minimum number of pit-stops, and cars usually make pit stops only if there is severe damage to the nose section.
    • The cars run the whole race on a single set of all-weather tyres, meaning tyre changes/pitstops do not serve as strategic factors.
    • There is no rear wing, paving the way for nose to tail racing and tight packs from start to end. Instead there are massive diffusers at the back that provide the necessary downforce.
  • Major Strategic Components:
    • Attack Mode: This is a special provision in FE, where drivers can choose to activate the "Attack Mode". This gives the driver an extra 35kW of power for a specific amount of time. However, to enable attack mode, the driver has to drive through a specially marked segment on the track that is away from the racing line, meaning the driver loses some track time in activating attack mode. Another catch is that the teams and drivers are given details of attack mode for each race only 60 mins prior to the start of the race. These details include where the attack mode enabling zones are, what the minimum number of attack modes for that race is (each driver has to enable attack mode at least that many times before the end of the race), the duration of each attack mode etc. The halo is lit up with a blue light to indicate the driver has enabled attack mode.
    • Fan Boost: Fan Boost is yet another "Augmented Power" mode that gives drivers 25kW of added power for 5 secs when enabled. Fan Boost, as the name suggests, is given to the drivers with the highest number of votes from fans. The voting is active until 15 mins prior to the start of the race. The halo is lit up with a purple light to indicate the driver is in Fan Boost mode. This 5 sec boost can sometimes end up aiding that crucial last lap overtake to win the race, so every bit of power matters!
    • Battery Management: Since the race is not run for a set number of laps, battery management, system temperature management etc. are key strategic components. The leader of the pack decides how many laps are run. If the leader keeps going full power all throughout, the entire pack will have to complete more laps within the 45 mins, and vice versa if the leader decides to hold back the pack. In the former strategy, the leader is at risk of running out of charge before the end of the race (which happened in Mexico E-Prix 2019), and the latter puts the leader at a higher risk of an overtake from the back. In the race this Sunday, the race leader António Félix da Costa (Team: DS TECHEETAH, reigning team champions) had battery temperature issues close to the last lap, enabling BMW i Andretti's Maximilian Günther, the youngest driver on the grid, to make the crucial last minute overtake and win the race.
      • The battery management also brings the spotlight to a crucial battery conservation technique called "Lifting and Coasting". Unlike F1, battery conservation is key in FE, meaning drivers cannot go full power for the entire straight and brake hard while approaching corners. Instead, drivers have to lift off the power when they can, and coast into the corners, to save energy. This means braking zones are vastly different compared to F1 cars, and this adds a whole new strategic component to the race that is solely controlled by the driver.
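
The {45min+1lap} format and the battery management point above can be put into a rough sketch. It assumes the leader laps at a constant pace, which never happens in a real race, and the 52.0 kWh energy budget in the usage lines is a purely illustrative number, as are the function names.

```python
RACE_SECONDS = 45 * 60  # the timed part of the race


def total_laps(leader_lap_time):
    """Count race laps when the leader laps at a constant pace (in seconds)."""
    elapsed, laps = 0.0, 0
    # Every lap the leader starts before the 45 minutes are up gets completed.
    while elapsed < RACE_SECONDS:
        elapsed += leader_lap_time
        laps += 1
    return laps + 1  # plus the one final lap


def energy_per_lap(budget_kwh, leader_lap_time):
    """Average energy a car can spend per lap and still reach the finish."""
    return budget_kwh / total_laps(leader_lap_time)


fast_laps = total_laps(70.0)  # leader pushing hard
slow_laps = total_laps(80.0)  # leader holding the pack back

# Illustrative budget only: more laps means less energy available per lap.
fast_budget = energy_per_lap(52.0, 70.0)
slow_budget = energy_per_lap(52.0, 80.0)
```

With these made-up lap times a pushing leader makes everyone run 40 laps instead of 35, so each lap has to be driven on noticeably less energy. That is why the leader's pace is a strategic weapon and also a risk to the leader himself.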
All that being said, here are my top reasons for watching FE:
  1. Every race is as unpredictable as it gets, with extremely high last minute drama and nose to tail racing. (that's precisely what you want to see!)
  2. The cars look, sound and perform like something out of Batman's caves!
  3. The last season saw the first 8 races have 8 different winners, and 8 different teams take the trophy. When more than half the number of races have different winners, you can tell how evenly matched the teams and drivers are and how tough and exciting the competition will be!
  4. They keep coming up with new innovations like the new "Driver's Eye View" (Looks like something straight out of a VR game!!) introduced this season!
So, do give it a shot! Their Instagram page, Fia Formula E, is super active, so finding information takes just the touch of a button (more than just one, but you get it!)! You can also find all race highlights and some amazing compilations on their official YouTube page!
Let me know in the comments if you have any questions about this series!


    Author

    Random chinwagger who is as ambiverted as is sporadic


powered by your device's battery (I know I will regret this when I'm older :P)