Building an authenticated Python CLI

β€’ ⏱ 13 min read

When building out the Notia client, we found a real lack of resources around building a persistently authenticated Python library.

To address this, we are going to be building an interactive, authenticated Python CLI that uses the Twitter API to fetch the top Machine Learning tweets of the week! You can see the final result in the video demo above - or you can skip to the final code here.

Building this CLI will let us explore concepts like authenticating a local device between uses, accepting CLI arguments with Click, and displaying our data interactively with Rich.

Twitter API Authentication

The Twitter API offers a few different methods of authentication depending on your use case. We will only be looking to query publicly available information, so the simple OAuth 2.0 authentication scheme is perfect.

The image below from their documentation shows how simple the flow is:

Twitter OAuth 2.0 Flow

All we need to do is provide the Client ID and the Client Secret using Basic authentication to retrieve a Bearer Token. After that, we simply provide the token with each subsequent request.
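To make that exchange concrete, here is a minimal sketch of the Basic authentication step. The header construction is exactly what HTTP Basic auth does under the hood; the actual POST (shown commented out) is for illustration only, since we'll lean on requests_oauthlib for the real thing.

```python
import base64

def basic_auth_header(client_id: str, client_secret: str) -> str:
    # Basic auth is just "Basic " + base64(client_id:client_secret)
    raw = f"{client_id}:{client_secret}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")

header = basic_auth_header("my-id", "my-secret")
print(header)  # Basic bXktaWQ6bXktc2VjcmV0

# The token exchange is then roughly:
# requests.post(
#     "https://api.twitter.com/oauth2/token",
#     headers={"Authorization": header},
#     data={"grant_type": "client_credentials"},
# )
```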

To get started, let's sign up as a developer on the Twitter Developer Dashboard here. Make sure to note down your Client ID, Client Secret and your app name - you'll need them later.

Project Setup

Next, let's start a new Python project with Poetry and add our required dependencies. We've called our project Slice of Machine Learning!

poetry new sliceofml
cd sliceofml
poetry add click rich requests-oauthlib

Basic CLI with Click

Click is the de facto standard for building intuitive CLIs in Python. Let's mock up a basic CLI that has the interface we want to expose to our users. We can create a file, cli.py, and point Poetry at it in our pyproject.toml:

# pyproject.toml
[tool.poetry.scripts]
sliceofml = "sliceofml.cli:cli"

Next, let's set up our stub functions to test out how our users will interact with our library.

Our library will expose just 2 commands:

  • login - Checks if the user is already logged in, and if not, prompts them. They can also pass the --relogin flag to forcefully update their credentials.
  • slice - Fetches the requested time range of tweets from Twitter and displays them.

The simple interface below achieves what we are looking for:

# cli.py
import click


@click.group()
def cli():
    """Slice of ML or sliceofml is your little 🍰 of ML."""


@cli.command("login")
@click.option("--relogin", "-r", is_flag=True)
def login(relogin):
    click.echo(relogin)


@cli.command("slice")
@click.option("--daily", "frequency", flag_value="daily", default=True)
@click.option("--weekly", "frequency", flag_value="weekly")
def slice(frequency):
    click.echo(frequency)

Using poetry shell, let's try out our fancy new commands and see how the interface looks.

Interface
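If you'd rather not install the package while iterating, Click also ships a test runner that lets us poke at the stubs directly. The snippet below re-declares the slice stub (renamed slice_cmd to avoid shadowing the builtin) so it is self-contained:

```python
import click
from click.testing import CliRunner

@click.group()
def cli():
    """Slice of ML or sliceofml is your little 🍰 of ML."""

# Same options as the stub above; the command is still named "slice"
@cli.command("slice")
@click.option("--daily", "frequency", flag_value="daily", default=True)
@click.option("--weekly", "frequency", flag_value="weekly")
def slice_cmd(frequency):
    click.echo(frequency)

runner = CliRunner()
print(runner.invoke(cli, ["slice"]).output)              # daily
print(runner.invoke(cli, ["slice", "--weekly"]).output)  # weekly
```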

Persistent Authentication

Now that we have our interface defined, let's start building out our authentication functionality. In order to keep our local device authenticated between uses, we need to store our Bearer Token somewhere.

For that, we will use the ~/.netrc file. This pattern has a long history, and is currently used by some popular CLIs such as the Heroku CLI. The netrc file format is not particularly well defined (as excellently explained here), however it will work great for our purposes.

The core of the ~/.netrc format is a simple entry with 3 fields. Let's see how an entry would look for our app:

machine api.twitter.com login <APP_NAME> password <BEARER_TOKEN>
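The standard library can already read this format (writing is another story, as we'll see shortly). Here's a quick sanity check that such an entry parses the way we expect, using a throwaway file:

```python
import netrc
import os
import tempfile

entry = "machine api.twitter.com login SliceOfML password token123\n"
with tempfile.NamedTemporaryFile("w", suffix="netrc", delete=False) as fh:
    fh.write(entry)
    path = fh.name

# authenticators() returns a (login, account, password) tuple
login, account, password = netrc.netrc(path).authenticators("api.twitter.com")
print(login, password)  # SliceOfML token123
os.unlink(path)
```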

Collecting credentials πŸ”‘

We need to collect the Client ID, Client Secret and App Name from our users and create an entry in the netrc file. We could collect these from the user just using print and input statements, but it would be nice to make our CLI a little more... lively.

For this, let's reach for one of the best Python libraries out there - Rich. Rich is an awesome library for building TUIs (terminal user interfaces), featuring tons of useful functions and classes to make building UIs easy.

Let's create a new file, display.py, and create a new Display class. This class will abstract over the top of the Rich Console API to create some functions we can reuse throughout our CLI.

You can see our Display class below:

# display.py
from typing import Optional

from rich.console import Console


class Display:
    def __init__(self) -> None:
        self._console = Console()

    def log(self, msg_obj=None) -> None:
        self._console.print(msg_obj, style="bold green")

    def log_styled(self, msg_obj=None, style: Optional[str] = None) -> None:
        self._console.print(msg_obj, style=style)

    def warning(self, msg_obj=None) -> None:
        self._console.print(msg_obj, style="bold yellow")

    def error(self, msg_obj=None) -> None:
        self._console.print(msg_obj, style="bold red")

This class may seem overkill right now, but we will extend it later to display our tweets.

Let's use this to write a function to prompt users for their Client ID and Secret. The Rich Panel class allows us to create a pretty slick looking prompt. We will pair this prompt with getpass and input to get the required information from our users. We can store all this in a new file: apikey.py.

# apikey.py
import getpass
from typing import Tuple

from rich import box
from rich.panel import Panel

from .display import Display

DEVELOPER_DASHBOARD_URL = "https://developer.twitter.com/en/portal/dashboard"

display = Display()


def prompt_api_details() -> Tuple[str, str, str]:
    api_prompt = Panel(
        f"""
        You can find your API keys :key: on your Twitter App Dashboard
        [blue underline bold][link={DEVELOPER_DASHBOARD_URL}]here[/link][/blue underline bold]
        """,
        box=box.ROUNDED,
    )
    display.log_styled(api_prompt, style="yellow")
    display.log(
        "Paste the Client ID, Secret and App Name from your profile and hit enter: "
    )
    client_id = getpass.getpass(prompt="Client ID 🆔 ")
    client_secret = getpass.getpass(prompt="Client Secret 🕵️ ")
    app_name = input("App Name ✏️ ")
    return (client_id, client_secret, app_name)

You can see that Rich supports loads of great features we can take advantage of, such as easy styling, hyperlinks and more! Inside the Panel, we've styled our output as a link, allowing users to navigate straight to the Developer Dashboard.

Now we have our prompt written, let's quickly update our login function to see how it looks:

# cli.py
def login(relogin):
    (client_id, client_secret, app_name) = prompt_api_details()
    click.echo(f"""🔑 Your Super Secret Credentials 🔑
    Client ID: {client_id}
    Client Secret: {client_secret}
    App Name: {app_name}""")

As we can see, our TUI is really coming together! Let's move on to storing the user input.

Login

Fetching our Bearer Token

We've now got our user credentials, but they aren't the final piece of the puzzle. We need to exchange them for a Bearer Token via the Twitter API. Instead of manually creating a POST request to fetch the token, we can leverage the requests_oauthlib library to make the exchange easier.

Let's define a function request_access_token which will take in our client_id and client_secret and return us a fresh Bearer Token:

# apikey.py
from oauthlib.oauth2 import BackendApplicationClient
from requests.auth import HTTPBasicAuth
from requests_oauthlib import OAuth2Session

REQUEST_TOKEN_URL = "https://api.twitter.com/oauth2/token"


def request_access_token(client_id: str, client_secret: str) -> str:
    auth = HTTPBasicAuth(client_id, client_secret)
    client = BackendApplicationClient(client_id=client_id)
    oauth = OAuth2Session(client=client)
    try:
        token = oauth.fetch_token(token_url=REQUEST_TOKEN_URL, auth=auth)
        return token["access_token"]
    except Exception as err:
        display.error(f"{err}")
        raise ValueError(err)

Again, let's modify our cli.py to quickly test this out:

# cli.py
def login(relogin):
    (client_id, client_secret, app_name) = prompt_api_details()
    bearer_token = request_access_token(client_id, client_secret)
    click.echo(f"Your bearer token is: {bearer_token}")

Token

As we can see, we have successfully retrieved a new Bearer Token! Let's move on to storing this token in our netrc file.

Writing and reading the netrc file

To store our token in the netrc file, we need some way to create or modify an entry and write it. Unfortunately, the netrc module from the standard library doesn't actually provide the ability to write to the netrc file. Luckily for us, we can take some inspiration from the excellent Weights and Biases client to see how they've written to the netrc file with their write_netrc function.

In addition to just writing to the file, we need a function that can check if an entry already exists (this will prevent the need for a user to log in every time). For this, we can rely on _find_netrc_api_key, again from Weights and Biases.

These functions are quite long, but not terribly complex. For the sake of brevity we've omitted them here, but you can check them out in their full glory here:

Make sure to include these functions in your apikey.py. You'll see how we've used these functions in the following sections.
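To give a feel for what those helpers do without reproducing them, here is a deliberately simplified, hypothetical writer (write_netrc_entry is our own name for this sketch, not the W&B function). It only handles single-line entries, which is all our CLI ever writes:

```python
import os

def write_netrc_entry(path: str, machine: str, login: str, password: str) -> None:
    # Hypothetical, simplified sketch -- the real write_netrc handles more edge cases
    lines = []
    if os.path.exists(path):
        with open(path) as fh:
            # Keep every line except a previous entry for this machine
            lines = [l for l in fh.read().splitlines() if f"machine {machine}" not in l]
    lines.append(f"machine {machine} login {login} password {password}")
    with open(path, "w") as fh:
        fh.write("\n".join(lines) + "\n")
    os.chmod(path, 0o600)  # netrc files should only be readable by their owner

write_netrc_entry("demo_netrc", "api.twitter.com", "SliceOfML", "token123")
```

Running this twice is idempotent: the stale entry is dropped before the fresh one is appended.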

Tying it all together

Now that we can write and read from the netrc, let's tie it all together in our CLI.

First, we will define a function that uses _find_netrc_api_key and returns our app name and token separately.

# apikey.py
def fetch_credentials(api_url: str) -> Tuple[str, str]:
    agent, token = None, None
    auth = _find_netrc_api_key(api_url, True)
    if auth and auth[0] and auth[1]:
        agent = auth[0]
        token = auth[1]
        return (agent, token)
    else:
        raise ValueError(
            f"Could not find entry in netrc file for provided URL: {api_url}"
        )

And with that, we have all the pieces we need to finish off our login function! Check it out below:

# cli.py
def login(relogin):
    try:
        apikey_configured = fetch_credentials(TWITTER_API) is not None
    except ValueError:
        # No existing netrc entry means we aren't logged in yet
        apikey_configured = False
    if relogin:
        apikey_configured = False
    if not apikey_configured:
        (client_id, client_secret, app_name) = prompt_api_details()
        token = request_access_token(client_id, client_secret)
        write_netrc(TWITTER_API, app_name, token)
    else:
        click.echo("You're already logged in! 🔑")

Our flow looks great! And if we test it out and cat the contents of our netrc file, we can see:

machine api.twitter.com login SliceOfML password AAAAAAAAAAAAAAAAAAAAAM2qegEAAAAAcdvqnZQrt...

Success!

Getting the right Tweets

Now we've finished off our login function, let's dig into the slice command and see how we can fetch the tweets we are looking for. Unfortunately, the V2 Twitter API doesn't offer an easy-to-use endpoint for fetching popular tweets. However, we are more than capable of building our own. Exploring the Twitter API docs leads us to the handy /tweets/search/recent endpoint, which fetches the last 7 days of tweets.

Let's create a new file, api.py, and start a very basic API class containing a requests client:

# api.py
import requests

from .display import Display


class API:
    def __init__(self, user_agent: str, bearer_token: str, api_url: str) -> None:
        self._session = requests.Session()
        self._api_url = api_url
        self._request_url = self._api_url + "/2/tweets/search/recent"
        self._page_size = 100
        self._max_pages = 100
        self._user_agent = user_agent
        self._bearer_token = bearer_token
        self._display = Display()

    def bearer_oauth(self, r):
        # Used as a requests auth callable, so it must return the request
        r.headers["Authorization"] = f"Bearer {self._bearer_token}"
        r.headers["User-Agent"] = self._user_agent
        return r

    def _get_request(self, url: str) -> requests.Response:
        # Assumed helper (elided in the original post): issue an authenticated GET
        return self._session.get(url, auth=self.bearer_oauth)

    def query(self, frequency: str) -> None:
        response = self._get_request(self._request_url)
        print(response.json())

Have a go at plugging this into your CLI function. You'll find that all the tweets returned are pretty irrelevant to us, but it's great to make first contact!

Filtering

Now that we've fetched at least some tweets, let's start homing in on the ones we want. For that, we need to define some good filters. The 'High Quality Filters' tutorial gives a deep dive on tailoring the API, but to summarize, the functionality we are interested in is Tweet Annotations.

Twitter tags each tweet with both Entity Annotations (NER) and Context Annotations. We can use the context_annotations to fetch only ML related tweets. They offer a handy CSV on their GitHub with every context annotation listed. It's as simple as searching the CSV to find 'Machine Learning'. Each entry consists of the domain_id, entity_id and entity_name. We can see that ML falls under the Interests and Hobbies category with a domain_id of 66 and an entity_id of 852262932607926273.

Putting these together, we can now build a new URL in the following format:

https://api.twitter.com/2/tweets/search/recent?query=context%3A66.852262932607926273

Just this would get us all the ML-related tweets from the past 7 days, which isn't far off what we want. However, there are a few more parameters we need to enrich our final 🍰 of ML.

  • tweet.fields=public_metrics - Allows us to fetch the info we will need for sorting our tweets by popularity.
  • expansions=author_id - Enriches our API response with information about the user. We are particularly interested in the username.
  • max_results - The Twitter API caps each page of results at 100. We can use this parameter, in conjunction with the next_token, to fetch multiple pages of results and collect them.
  • start_time - The tweets/search/recent endpoint gives us results for the past 7 days, which means we get our weekly functionality for free. However, if we want to offer the daily option, we also need to provide an RFC3339 formatted timestamp.
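Rather than hand-escaping characters like the : in our context filter, we can let urllib.parse assemble the query string for us. A sketch of the parameters above, with yesterday's timestamp for the daily case:

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode

params = {
    "query": "context:66.852262932607926273",
    "tweet.fields": "public_metrics",
    "expansions": "author_id",
    "max_results": 100,
}

# For --daily, add an RFC3339 start_time 24 hours in the past
start = datetime.now(timezone.utc) - timedelta(days=1)
params["start_time"] = start.strftime("%Y-%m-%dT%H:%M:%SZ")

# urlencode handles the %3A escaping of ':' for us
url = "https://api.twitter.com/2/tweets/search/recent?" + urlencode(params)
print(url)
```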

Let's define a function, _build_url in our API class that we can use to append all the parameters we are interested in.

# api.py
from datetime import datetime, timedelta
from typing import Optional


def _build_url(self, next_token: Optional[str], frequency: str) -> str:
    query_url = (
        f"{self._request_url}?query=context:66.852262932607926273"
        f"&tweet.fields=public_metrics"
        f"&max_results={self._page_size}"
        f"&expansions=author_id"
    )
    if next_token and len(next_token) > 1:
        query_url = f"{query_url}&next_token={next_token}"
    if frequency == "daily":
        # start_time must be an RFC3339 formatted timestamp
        timestamp = datetime.utcnow() + timedelta(days=-1)
        query_url = f"{query_url}&start_time={timestamp.isoformat('T')}Z"
    return query_url

We will need to enhance our query function in order to drive the new pagination functionality. We should also extract only the fields we are interested in for forwarding to our final display function. These are:

  • id - Every tweet is assigned a unique ID. This Twitter blog post gives us a handy trick for building a live Tweet URL just from the ID.
  • text - The meat of the tweet!
  • like_count - We will be sorting our tweets by likes as a proxy for popularity.
  • username - We need to build a map of user_id => username in order to correctly match up a tweet to its author.
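Before wiring up pagination, it's worth seeing the extraction step in isolation. The payload below is fabricated, but it matches the shape the endpoint returns with our expansions:

```python
# A hand-made response in the shape /tweets/search/recent returns
sample = {
    "data": [
        {"id": "1", "text": "GPUs go brrr", "author_id": "u1",
         "public_metrics": {"like_count": 42}},
    ],
    "includes": {"users": [{"id": "u1", "username": "ml_fan"}]},
    "meta": {},
}

# Map user_id => username, then pull out just the fields we care about
user_map = {u["id"]: u["username"] for u in sample["includes"]["users"]}
tweets = [
    (t["id"], t["text"], t["public_metrics"]["like_count"], user_map.get(t["author_id"]))
    for t in sample["data"]
]
print(tweets)  # [('1', 'GPUs go brrr', 42, 'ml_fan')]
```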

Putting this all together, you can see our query function defined below:

# api.py
def query(self, frequency: str):
    page, next_token, user_map, tweets = 0, "", {}, []
    while page < self._max_pages and next_token is not None:
        response = self._get_request(self._build_url(next_token, frequency))
        json = response.json()
        next_token = json["meta"].get("next_token")
        for user in json["includes"]["users"]:
            user_map[user["id"]] = user["username"]
        for tweet in json["data"]:
            tweets.append((
                tweet["id"],
                tweet["text"],
                tweet["public_metrics"]["like_count"],
                user_map.get(tweet["author_id"]),
            ))
        page += 1
    print(tweets)
    return tweets

Have a go at printing out the fields we extracted before we move on to displaying them.

Displaying our tweets

To display our tweets, we can again lean on Rich and start enhancing our previously overkill Display class. We can use the Table class to display each tweet as a row.

Check out the Table docs for all the different ways you can customize the table. For clarity, we've written a few helper functions to build the Profile and Tweet links. We've also sorted the tweets by like count and taken the top 10, so you only get the best of ML Twitter!

# display.py (new methods on the Display class)
def buildProfileLink(self, username: str) -> str:
    return f"[bold blue][link={self.TWITTER_BASE}/{username}]@{username}[/link][/bold blue]"

def buildTweetLink(self, _id: str) -> str:
    return f"[bold blue][link={self.TWITTER_BASE}/twitter/status/{_id}]View Tweet[/link][/bold blue]"

def tweetsAsTable(self, tweets: List, frequency: str) -> None:
    tweets.sort(reverse=True, key=lambda t: t[2])
    tweets = tweets[:10]
    table = Table(
        show_header=True,
        box=box.ROUNDED,
        show_lines=True,
        padding=(0, 1, 1, 0),
        border_style="yellow",
        caption_style="not dim",
    )
    table.title = f"[not italic]🍰 Your {frequency} Slice of ML 🍰[/not italic]"
    table.caption = "Made with ❤️ by the team at [bold blue][link=https://notia.ai]Notia[/link][/bold blue]"
    table.add_column("Username 🧑", justify="center")
    table.add_column("Tweet 🐦", justify="center", header_style="bold blue", max_width=100)
    table.add_column("Tweet Link 🔗", justify="center")
    table.add_column("Likes ❤️", justify="center", header_style="bold red")
    for tweet in tweets:
        table.add_row(
            self.buildProfileLink(tweet[3]),
            tweet[1],
            self.buildTweetLink(tweet[0]),
            str(tweet[2]),
        )
    self._console.print(table)
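As an aside, Rich's record mode makes table layouts like this easy to test: Console(record=True) captures everything printed, and export_text() hands it back as plain text. A small sketch with a cut-down table:

```python
from rich import box
from rich.console import Console
from rich.table import Table

table = Table(title="🍰 Your daily Slice of ML 🍰", box=box.ROUNDED)
table.add_column("Username 🧑", justify="center")
table.add_column("Likes ❤️", justify="center")
table.add_row("@ml_fan", "42")

# record=True keeps a transcript we can export and assert against
console = Console(record=True, width=60)
console.print(table)
text = console.export_text()
print(text)
```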

Finally, let's connect all this up to our slice command in cli.py like so:

# cli.py
def slice(frequency):
    credentials = fetch_credentials(TWITTER_API)
    tweets = API(credentials[0], credentials[1], TWITTER_API).query(frequency)
    Display().tweetsAsTable(tweets, frequency)

And there you have it! Your very own authenticated, interactive Python CLI. Check out the GitHub repo for an up-to-date version of the code, or just pip install sliceofml to get your daily slice of Machine Learning!

Written by Christopher Fleetwood