Video Generator is a Python tool that converts a collection of photos from local and online sources into a video.
While working on a computer vision task that involved extracting information from public surveillance camera streams, I needed at one stage to record videos from a collection of photo URLs, so I wrote this simple script to help with that stage and shared it publicly as well.
In this article, I am going to show you how we can cut the total running time of a simple API script from 6 minutes 17 seconds to 1 minute 14 seconds, roughly a fivefold speedup.
I will share one of my favourite simple techniques, which I like to use especially on data science tasks such as data visualization, data analysis, code optimization, and big data processing.
Processing a task sequentially may take a long time, especially when we are dealing with a huge amount of data (e.g. big inputs).
This technique takes advantage of parallelization capabilities in order to reduce the processing time.
The idea is to divide the data into chunks so that each engine takes care of classifying the entries in its own chunk. Each engine reads, processes and writes its chunk independently, and when the chunks are of equal size, each one should take roughly the same amount of time to process.
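To make the chunking idea concrete, here is a minimal Python sketch. It is not the script from the repo: the function names are hypothetical, and process_chunk only upper-cases names as a stand-in for the real per-entry work (for example, an API call).

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, chunk_size):
    """Split a list into consecutive chunks of at most chunk_size items."""
    return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]


def process_chunk(chunk):
    """Placeholder work (an assumption): a real script would classify each entry."""
    return [name.upper() for name in chunk]


def process_in_parallel(items, chunk_size, workers=4):
    """Process every chunk in its own worker and return results in input order."""
    chunks = split_into_chunks(items, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in the same order as the input chunks.
        results = list(pool.map(process_chunk, chunks))
    # Flatten the per-chunk results back into a single list.
    return [entry for chunk_result in results for entry in chunk_result]
```

Threads are a reasonable fit when each entry mostly waits on the network. For CPU-heavy work, a process pool would probably be the better choice.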
The example I chose for this article is genderizing names that consist of 2 alphabetic characters.
Output Analysis Chart
Clone the GitHub repo and follow the instructions in the Usage section.
Let’s generate all alphabetic names that consist of 2 characters (to keep the testing process easy).
We can use a Kali Linux penetration testing tool such as crunch:
$ crunch 2 2 > names.txt
This generates all possible alphabetic names of length 2 (676 lines).
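If crunch is not available, the same list can be produced in plain Python. This is only a sketch of an alternative, assuming lowercase a-z is the alphabet we want. The function names and the default names.txt path in write_names are illustrative.

```python
from itertools import product
from string import ascii_lowercase


def generate_names(length=2):
    """Return every lowercase alphabetic string of the given length, in order."""
    return ["".join(letters) for letters in product(ascii_lowercase, repeat=length)]


def write_names(path="names.txt", length=2):
    """Write one generated name per line and return how many were written."""
    names = generate_names(length)
    with open(path, "w") as handle:
        handle.write("\n".join(names) + "\n")
    return len(names)
```

For length 2 this gives 26 * 26 = 676 names, from aa to zz, matching the line count above.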
Then let’s create the directories needed for the splitting process:
$ mkdir subs/ subs/inputs subs/outputs subs/outputs/parts subs/outputs/all
Now we can split our input data. There are many ways to do that, but I prefer the Unix split command:
$ split -l 100 -d names.txt ./subs/inputs/
This splits names.txt into small files of at most 100 lines each.
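For readers without the split command, here is a rough Python equivalent of the step above. It is a sketch under assumptions: the function names are hypothetical, and it only mimics the two-digit numeric part names (00, 01, ...) for small inputs like ours.

```python
from pathlib import Path


def plan_parts(lines, lines_per_file=100):
    """Map two-digit part names (00, 01, ...) to their slice of the input lines."""
    return {
        f"{index:02d}": lines[start:start + lines_per_file]
        for index, start in enumerate(range(0, len(lines), lines_per_file))
    }


def write_parts(source, out_dir, lines_per_file=100):
    """Write each planned part into out_dir and return the part names."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    lines = Path(source).read_text().splitlines(keepends=True)
    parts = plan_parts(lines, lines_per_file)
    for name, part in parts.items():
        (out_dir / name).write_text("".join(part))
    return list(parts)
```

With 676 input lines and 100 lines per file, this produces 7 parts: six full ones and a last part with the remaining 76 lines.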
Now let’s run all the processes with ./init.bash. After they finish, use the merger.py script to merge all the outputs. Merging is kept as a separate step to avoid conflicting writes and to keep the saved output sorted.
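To illustrate what a separate merge step can look like, here is a small sketch. It is not the repo's merger.py: the function names are hypothetical, and I am assuming each part file holds one result per line.

```python
from pathlib import Path


def merge_lines(parts):
    """Combine the lines of every part, drop blank lines, and sort the result."""
    merged = [line for part in parts for line in part if line.strip()]
    return sorted(merged)


def merge_directory(parts_dir, merged_path):
    """Merge every file in parts_dir into merged_path; return the line count."""
    part_files = sorted(p for p in Path(parts_dir).iterdir() if p.is_file())
    parts = [p.read_text().splitlines() for p in part_files]
    merged = merge_lines(parts)
    Path(merged_path).write_text("\n".join(merged) + "\n")
    return len(merged)
```

Because the merge runs only after all workers are done, no two processes ever write to the same file, and the sort happens once over the complete output.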
A person can take a photo of their dog and caption it with the hashtag #Tree instead of #Dog 😀. I thought about how companies, and we ourselves, can improve the search process so that we get only the real target we want (e.g. images that contain the items or objects we need). I don’t want to search for Tree and get Cats in the results 🙂, or vice versa.
The first thing that comes to mind is Object Detection and Recognition techniques, which fit perfectly in such cases.
Here is a simple example of how ML can bring amazing improvements. Think from a marketing perspective. Let’s say we want to target a certain type of audience based on their photos on social media (a perfect example here may be Instagram), to contact the post authors or announce our new sales!
Real example: we have a pet (cat) food shop and we want to target Instagram users who have a pet (a cat). In the example below, the output shows the results of searching using:
Human Hashtags + ML = 20(cats)/20(total result) -> [100% correct]
Human Hashtags Only = 12(cats) / 20(total result) -> [60% correct]
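The two percentages above are simply the share of returned images that really contain a cat. As a tiny worked example (the function name is my own, not from the project):

```python
def percent_correct(relevant, total):
    """Share of search results that actually match the target, as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100 * relevant / total


# Numbers taken from the two searches above.
hashtags_plus_ml = percent_correct(20, 20)
hashtags_only = percent_correct(12, 20)
```

Here hashtags_plus_ml is 100.0 and hashtags_only is 60.0, matching the figures above.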
2 Amazon SDE interview invitations in 1 week, a new experience!
I used to get rejected by Amazon at the CV screening stage :’( but I never give up! I also used to feel low on energy after finishing contests, e.g. interesting Codeforces rounds. In contests I sometimes solve problems that are much harder than the interview questions of multinational companies, but I have to admit that after these 2 interviews my battery is not just low as usual, it is literally dead :’)
Let’s analyze briefly:
Total time: 1.5 to 2.5 hours
Question topics: DP, BT, recursion, string manipulation, basic math, building and sorting complex data structures.
A session for algorithms & data-structures coding questions.
A session for open-ended questions to discuss your solutions and complexity.
A session for reasoning questions(very tricky).
A session for code debugging ability(not hard).
A session for working-style questions(focused on soft skills + psychological dimensions).
A session for a survey.
Tricky corner test cases.
DS coding questions are annoying; you need to start with wise choices and smart ideas from the very beginning.
Open-ended questions have a very short time limit, so you need to think through and organize your answer in your mind before the session or during the coding session.
Overall, someone at Codeforces Div. 2 ~ D level can nail it.
For sure there are hundreds of other question topics, sessions, and styles, but this was my own experience!
If you are still an undergraduate, my advice is: “problem-solving” & “practice”.