UpNorthClark




  • Gen AI – Google’s Gemini

    So among other things, I am a Kaggler. Not sure what that means, but I had spent some time in some Jupyter notebooks, and when an email came in one day about a free course from Google, I took the bait.

    https://rsvp.withgoogle.com/events/google-generative-ai-intensive_2025q1/home

    After five days there was an optional capstone project for the quarter-million (yes, that's 250,000) participants to demonstrate what they had learned. (Let us pause here for a moment and consider how much energy the Google servers consumed teaching that many people about Generative AI. There were little meters on the notebooks showing CPU usage and whatnot, so I can't imagine that someone at Google hasn't tracked it. It would be interesting to know how much energy it was, and to relate it to something tangible, like tanks of gas in an SUV, or seconds of travel for an aircraft carrier.)

    Anyway, I couldn't think of anything sexy, but I did have a little pet project: a stack of PDFs I would otherwise have to manually extract information from for a little database, or a "library" as my fellow volunteers call it. Below is the story of the operational pipeline I built using Google's Gemini model in Kaggle's notebook web UI. Amazing!

    PDFs in need of summarization

    I have a library of PDFs that require specific information to be extracted from them, and I was going to read each one manually to pull that information out. When I took this course, I realized I could have a Gemini generative model "read" the PDFs, extract the information I would otherwise have typed into the database by hand, and give it to me.

    The Start

    I used the examples from the first two days to test all the pieces (i.e., read a PDF and summarize it, then progress to crafting a prompt that would extract the specific data I wanted). I noticed that even with the temperature set to 0.0, the output varied from run to run for certain kinds of requests; in particular, the course examples that simply asked for a summarization of a PDF certainly did vary. (For this notebook I used PDFs referenced in various lab examples throughout the Gen AI Intensive course.)
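
    For context, here is roughly what that early test looked like. This is a minimal sketch, not my actual notebook code: it assumes the google-genai Python SDK, a placeholder API key, a placeholder PDF URL, and the gemini-2.0-flash model name.

        import requests
        from google import genai
        from google.genai import types

        client = genai.Client(api_key="YOUR_API_KEY")     # placeholder key

        # Fetch a sample PDF and ask for a plain summary.
        pdf_bytes = requests.get("https://example.com/sample.pdf", timeout=60).content

        response = client.models.generate_content(
            model="gemini-2.0-flash",
            contents=[
                types.Part.from_bytes(data=pdf_bytes, mime_type="application/pdf"),
                "Summarize this document in three sentences.",
            ],
            # Temperature 0.0 should be as deterministic as it gets,
            # yet the summaries still varied from run to run.
            config=types.GenerateContentConfig(temperature=0.0),
        )
        print(response.text)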

    Going from an example to an ‘ops’ pipeline

    Moving from a one-off example to a pipeline for an "ops" situation took quite a bit of work.

    API drives and feeds the pipeline

    First, I built an API that serves a simple JSON array of objects, each with a url property, and wrote the notebook code to visit each URL in a loop: https://pacp.ca/kaggi/api/
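
    Something like this, to give the idea. The endpoint is the one above; the function name and the exact JSON shape are my illustration of the "array of objects with a url property" that the API serves.

        import requests

        API_INDEX = "https://pacp.ca/kaggi/api/"          # the endpoint above

        def fetch_pdf_urls():
            # The API serves a JSON array of objects, each with a "url" property,
            # e.g. [{"url": "https://example.com/report.pdf"}, ...]
            resp = requests.get(API_INDEX, timeout=30)
            resp.raise_for_status()
            return [item["url"] for item in resp.json()]

        for pdf_url in fetch_pdf_urls():
            print(pdf_url)                                # each URL feeds the loop below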

    Loopy loops loop lovingly

    Then, after much trial and error (assisted by Gemini AND ChatGPT), I was able to write loop code that asks the model to find and "generate" the information needed from each PDF.
    I added some code to check that each URL was valid and truly pointed to a PDF.
    As the loop does its job, the results are displayed for the user, a nice thing for debugging.
    But the key was storing that information in an array and, at the end, sending it back to the API with the results, completing the loop. A sketch of the whole loop follows.
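
    Again, this is a sketch rather than my actual capstone code: the extraction prompt is a stand-in for the real one, the POST target is hypothetical (I won't name the exact endpoint the results go to here), and fetch_pdf_urls() is the helper sketched above.

        import requests
        from google import genai
        from google.genai import types

        client = genai.Client(api_key="YOUR_API_KEY")     # placeholder key
        RESULTS_ENDPOINT = "https://pacp.ca/kaggi/api/"   # hypothetical POST target

        results = []
        for url in fetch_pdf_urls():                      # helper sketched above
            # Validate the URL and make sure it is truly a PDF (magic-number check).
            resp = requests.get(url, timeout=60)
            if resp.status_code != 200 or not resp.content.startswith(b"%PDF"):
                print(f"skipping {url}: invalid URL or not a PDF")
                continue

            # Ask the model to "generate" the needed information from the PDF.
            answer = client.models.generate_content(
                model="gemini-2.0-flash",
                contents=[
                    types.Part.from_bytes(data=resp.content, mime_type="application/pdf"),
                    "Extract the fields I need, as JSON.", # stand-in for the real prompt
                ],
                config=types.GenerateContentConfig(temperature=0.0),
            )
            print(answer.text)                            # shown as we go; nice for debugging
            results.append({"url": url, "extracted": answer.text})

        # Complete the loop: send everything back to the API in one POST.
        requests.post(RESULTS_ENDPOINT, json=results, timeout=30)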

    API receives the results

    I store the results in a separate file, and they can be inspected by calling the API here: https://pacp.ca/kaggi/api/postindex.php

    If you run this notebook again, you'll see two new entries in it.

    Extra notes – ethics

    Calling an external API from an LLM notebook is not something my particular API wanted to allow; the requests smelled too much like a "bot," I suppose.

    Ironically, when I consulted Gemini and ChatGPT, I got advice on how to "work around" this by pretending to be a browser. Neither said anything about this being deceptive and a potential abuse, but since it was my own API and I had permission, I proceeded with the hackaround, and it worked. This means I definitely need rate limiting and other "measures" to prevent abuse. CORS won't solve it on its own, since it only constrains browsers, not server-side scripts like mine, and I turned that stuff off for this proof of concept anyway.
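
    The workaround amounts to sending a browser-like User-Agent header with the request. The exact string below is illustrative, and the endpoint and payload are the ones from the loop sketch. Only do this against an API you own or have explicit permission to call this way.

        import requests

        # Pretend to be a browser so the server stops rejecting the request as a bot.
        browser_headers = {
            "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
                          "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0 Safari/537.36",
        }

        requests.post(RESULTS_ENDPOINT, json=results, headers=browser_headers, timeout=30)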

    Also, one LLM gave me a piece of advice I thought was really good: I randomized a pause in the loop so that I would "mimic natural usage," whatever THAT means!
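
    Inside the loop, that pause is just a couple of lines. The two-to-eight-second range here is my illustration, not a number anyone prescribed.

        import random
        import time

        # At the bottom of each loop iteration: pause a random 2 to 8 seconds
        # between PDFs to "mimic natural usage" and go easy on both APIs.
        time.sleep(random.uniform(2, 8))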

    Summary

    This simple little pipeline of abilities (GenAI summarization, reading PDFs, reading and writing JSON to APIs) is a handy little tool that I'm hoping will save me many minutes of manual labour. Wish me luck!
