Page 30 - Innovation Magazine

from easy. In fact, it’s quite tedious and labor-intensive: “I have to manually assist the program to pick out the contour for each half of the fluke,” says Lisa Steiner, marine biologist and a renowned sperm whale researcher in the Azores. “If the photos are good, this process doesn’t take very long; however, if there isn’t a lot of contrast between the fluke and background, or there is a lot of glare on the edge of the fluke, I have to follow the contour manually.”

Global Data Science Challenge

One of our Capgemini colleagues experienced that cumbersome approach herself when she volunteered for an expedition led by Lisa. This colleague subsequently highlighted the work as a use case for 2019’s Global Data Science Challenge (GDSC). In this internal Capgemini-wide competition, hundreds of employees from all over the world compete against each other in small teams to solve a set AI-related task. Together with Lisa, she brainstormed on how to help and improve the current approach to fluke identification through AI. It was clear that, with the advancements made using AI in the field of image recognition over the last few decades, it really could make a difference, allowing Lisa to have more time for other important tasks. The proposal was to replace her dated legacy software (which was developed in the early 90s), harnessing the power of image processing capabilities to help eliminate the need for manual matching. That is how sperm whales became the star of 2019’s GDSC.

After several months, the winning team presented their solution, consisting of a pre-trained deep neural network (ResNet-101) that had been
fine-tuned with roughly 4,500 pictures containing flukes of more than 2,200 individual whales. The training took approximately three hours and was performed on a GPU-based AWS cluster with Amazon SageMaker. After this, the AI was capable of automatically cropping a new picture (removing unnecessary parts of the photo, leaving only the fluke in the center), comparing it to all other pictures in the database, and finding matches for a given sperm whale with 97.5% accuracy.

In order to achieve that performance, the team had to come up with different ideas on how to process the given pictures, from flipping them horizontally to artificially extend the number of “individual whales” for the AI, to applying a multitude of photo adjustments such as changing brightness, contrast, or saturation (and many more) to help the AI generalize better. After more than 50 iterations on the GPU cluster and countless trials run on their own local machines, they were able to fine-tune their approach to achieve the top score in the competition.

Using real-world technologies

The outstanding performance of the model was the most important success factor, but not the only one. The team also focused on implementing techniques that are future-proof and can be deployed and used in real-world scenarios. That’s why the different parts of the algorithm were implemented using state-of-the-art deep learning frameworks (TensorFlow and PyTorch). All pre-processing steps in the pipeline use a combination of common open-source libraries and are therefore easily maintained and extended, while deploying the pipeline as a lightweight service using AWS Lambda functions ensures straightforward scalability. This not only makes it easy to operate and improve the AI model for the sperm whale use case but also makes it possible to transfer the approach to similar but different problems. The
architecture follows a very modular design and uses pre-processing techniques that are common and useful for any kind of picture recognition task. This means the AI model can be retrained with pictures of other endangered species, so that the application can be used for a wide range of problems in the field of wildlife conservation.

Looking ahead

With the AI performing well and matching flukes with such high accuracy, there was now a highly functional solution. As the next step, a user interface needed to be designed, which wrapped everything into a web-based application that Lisa is now actively using for her work. All she needs to do is upload new pictures and wait a minute or two for the automated processing pipeline to complete; then she can enjoy the results: “I look forward to using it in the future and maybe even finding some matches that I missed the first time around [with the old tool],” says Lisa.

The work with Lisa continues in order to improve usability as well as to encourage other researchers to try out the application. The long-term hope is that this easy-to-use tool will be used by people from all walks of life who have the pleasure of spotting a sperm whale out on the Atlantic, not only by marine researchers.

Data-powered Innovation Review | ©2020 Capgemini. All rights reserved.
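
For readers curious what the picture processing described in the article might look like in practice, here is a minimal sketch in Python. It is not the winning team's code. It only illustrates the two ideas the article names: treating a horizontally flipped fluke as a new “individual whale”, and varying brightness, contrast, and saturation. The toy image format (rows of RGB pixels with values between 0 and 1), the function names, the `_flipped` label suffix, and the 0.8 to 1.2 jitter range are all assumptions made for illustration.

```python
import colorsys
import random

# Toy image type: rows of RGB pixels with channels in [0, 1]. This is an
# assumption for the sketch; real photos would be loaded with an imaging library.
Pixel = tuple[float, float, float]
Image = list[list[Pixel]]


def clamp(value: float) -> float:
    """Keep a channel value inside the valid [0, 1] range."""
    return max(0.0, min(1.0, value))


def flip_horizontal(img: Image) -> Image:
    """Mirror the image left to right by reversing every row."""
    return [row[::-1] for row in img]


def adjust_brightness(img: Image, factor: float) -> Image:
    """Scale every channel by `factor` (values above 1 brighten)."""
    return [[tuple(clamp(c * factor) for c in px) for px in row] for row in img]


def adjust_contrast(img: Image, factor: float) -> Image:
    """Push channel values away from (or toward) the image mean."""
    channels = [c for row in img for px in row for c in px]
    mean = sum(channels) / len(channels)
    return [
        [tuple(clamp(mean + factor * (c - mean)) for c in px) for px in row]
        for row in img
    ]


def adjust_saturation(img: Image, factor: float) -> Image:
    """Scale the saturation of each pixel in HSV space."""
    out = []
    for row in img:
        new_row = []
        for r, g, b in row:
            h, s, v = colorsys.rgb_to_hsv(r, g, b)
            new_row.append(colorsys.hsv_to_rgb(h, clamp(s * factor), v))
        out.append(new_row)
    return out


def augment(img: Image, whale_id: str, rng: random.Random):
    """Return (image, label) pairs: the original and its mirror image,
    each also in a randomly color-jittered variant. The mirrored fluke
    gets its own (hypothetical) label so it counts as a new individual."""
    samples = []
    for flipped in (False, True):
        base = flip_horizontal(img) if flipped else img
        label = f"{whale_id}_flipped" if flipped else whale_id
        jittered = adjust_brightness(base, rng.uniform(0.8, 1.2))
        jittered = adjust_contrast(jittered, rng.uniform(0.8, 1.2))
        jittered = adjust_saturation(jittered, rng.uniform(0.8, 1.2))
        samples.append((base, label))
        samples.append((jittered, label))
    return samples
```

Calling `augment(picture, "whale_0001", random.Random(0))` on one picture yields four training samples under two labels. In a real project, an image library such as torchvision offers comparable transforms (for example `RandomHorizontalFlip` and `ColorJitter`), which would likely be preferred over hand-written pixel loops.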
