Python uv cheat sheet
A cheat sheet of common Python uv usage; it doesn't cover every feature.
While PuTTY and Pageant are widely used tools for SSH connections, you can also integrate Pageant with VSCode to establish remote SSH connections without administrative privileges. The default ssh-agent service offers similar functionality, but it is disabled by default and requires administrative access to enable, which makes Pageant a more flexible alternative. Here's how you can set it up:
Inspired by the official guide, here is my method for installing (or updating) Ollama without root access.
mkdir -p ~/src
cd ~/src
curl -L https://ollama.com/download/ollama-linux-amd64.tgz -o ollama-linux-amd64.tgz
mkdir -p ~/opt/ollama
tar -C ~/opt/ollama -xzf ollama-linux-amd64.tgz
Add Ollama to $PATH:
Then you can run ollama in your terminal:
nohup ollama serve &
ollama -v
ollama list
# Pull the embeddings model for AnythingLLM RAG
ollama pull nomic-embed-text
# Run the smallest deepseek model
ollama run deepseek-r1:1.5b
To stop the Ollama server:
Use lsof, fuser, ss, pgrep, pstree, ps, htop, etc. to find process, port, and file usage in Linux.
Use spark.jars to add local ODBC/JDBC drivers to PySpark, and use spark.jars.packages to add remote ODBC/JDBC drivers; PySpark will download those packages from the Maven repository.
For spark-shell: https://docs.snowflake.com/en/user-guide/spark-connector-install#installing-additional-packages-if-needed
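As a rough illustration of the two settings, here is a minimal sketch. The helper names build_jar_conf and create_session are hypothetical, and the jar path and Maven coordinates in the usage comment are placeholders, not drivers this post recommends. Both settings take comma-separated lists.

```python
from typing import Dict, Sequence


def build_jar_conf(
    local_jars: Sequence[str] = (),
    maven_packages: Sequence[str] = (),
) -> Dict[str, str]:
    """Build the Spark config entries for driver jars (hypothetical helper)."""
    conf: Dict[str, str] = {}
    if local_jars:
        # spark.jars: comma-separated paths to jars already on disk
        conf["spark.jars"] = ",".join(local_jars)
    if maven_packages:
        # spark.jars.packages: comma-separated groupId:artifactId:version
        # coordinates that Spark resolves and downloads from Maven
        conf["spark.jars.packages"] = ",".join(maven_packages)
    return conf


def create_session(conf: Dict[str, str]):
    """Create a SparkSession with the given config (requires pyspark)."""
    # Imported lazily so build_jar_conf can be used without pyspark installed
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName("jdbc-drivers-demo")
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()


# Usage (placeholder path and coordinates, adjust to your own driver):
# spark = create_session(build_jar_conf(
#     local_jars=["/path/to/local-driver.jar"],
#     maven_packages=["org.postgresql:postgresql:42.7.3"],
# ))
```

These settings need to be in place before the session is created; changing them on an already running session generally has no effect.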
Python is a dynamically typed language, meaning variable types don't require explicit declaration. However, as projects grow in complexity, type annotations become increasingly valuable for code maintainability and clarity.
Type hints (PEP 484) have been a major focus of recent Python releases, and I was particularly intrigued when I heard about Guido van Rossum's work on MyPy at Dropbox, where the team needed robust tooling to migrate their codebase from Python 2 to Python 3.
Today, type hints are essential for modern Python development. They significantly enhance IDE capabilities and AI-powered development tools by providing better code completion, static analysis, and error detection. This mirrors the evolution we've seen with TypeScript's adoption over traditional JavaScript—explicit typing leads to more reliable and maintainable code.
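To make this concrete, here is a small sketch of what annotations buy you. The function names and data are made up for illustration. A type checker such as MyPy can use the Optional[int] return type to flag code that forgets to handle the missing case.

```python
from typing import Dict, Optional


def find_age(ages: Dict[str, int], name: str) -> Optional[int]:
    """Return the age for name, or None when the name is unknown."""
    return ages.get(name)


def describe(ages: Dict[str, int], name: str) -> str:
    """Format a short description, handling the missing case explicitly."""
    age = find_age(ages, name)
    if age is None:
        # Without this check, a type checker would reject `age + 1` below,
        # because `age` could be None.
        return f"{name}: unknown"
    return f"{name}: {age + 1} next year"
```

The annotations do not change runtime behavior; they are checked only by external tools and IDEs.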
Most of this post is based on the MyPy documentation.
Typed Python vs data science projects
Type hints are not very popular in data science projects, for reasons we won't discuss here.
Sometimes we need to download online videos, such as a Teams recording. Here's a tip from StackOverflow that uses ffmpeg. Be sure to check the comments for solutions to errors like "Error opening input: Invalid data found when processing input." Another option is kylon/Sharedown, which is much faster.
Databricks Connect allows you to connect your favorite IDE (PyCharm, VSCode, etc.) and other custom applications to Databricks compute and run Spark (or non-Spark) code.
This post is not a comprehensive guide on Databricks Connect; rather, it consists of side notes from the Azure Databricks docs. Most of the notes also apply to Databricks on AWS and GCP.
There are two modern ways to generate an Azure OAuth2 access token using Python: one is by using the MSAL library, and the other is by using the Azure Identity library, which is based on the former.
There are also other ways to get the token, such as using the requests or aiohttp libraries to send a POST request to the Azure OAuth2 token endpoint, but this is not recommended. Because the MSAL and Azure Identity libraries are the official libraries provided by Microsoft, they are more secure and easier to use. For example, they handle token caching, refreshing, and expiration automatically. Furthermore, some credential types are difficult to implement (they require too much code) with raw requests or aiohttp.
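The two library-based approaches could look roughly like the sketch below. The function names are hypothetical, the Azure Resource Manager scope is only an example, and the MSAL variant assumes a service principal with a client secret, which is just one of several credential types.

```python
from typing import Tuple

# Example scope (Azure Resource Manager); replace with the API you call
ARM_SCOPE = "https://management.azure.com/.default"


def get_token_with_azure_identity(scope: str = ARM_SCOPE) -> Tuple[str, int]:
    """Get a token via Azure Identity (requires the azure-identity package)."""
    from azure.identity import DefaultAzureCredential

    # Tries environment variables, managed identity, Azure CLI login, etc.
    credential = DefaultAzureCredential()
    token = credential.get_token(scope)
    # expires_on is a Unix timestamp in seconds
    return token.token, token.expires_on


def get_token_with_msal(
    tenant_id: str, client_id: str, client_secret: str, scope: str = ARM_SCOPE
) -> str:
    """Get a token via MSAL client credentials (requires the msal package)."""
    import msal

    app = msal.ConfidentialClientApplication(
        client_id,
        authority=f"https://login.microsoftonline.com/{tenant_id}",
        client_credential=client_secret,
    )
    # Returns a dict with "access_token" on success or "error" on failure
    result = app.acquire_token_for_client(scopes=[scope])
    if "access_token" not in result:
        raise RuntimeError(result.get("error_description", "token request failed"))
    return result["access_token"]
```

Reusing the same credential or application object across calls lets the libraries serve tokens from their cache instead of requesting a new one each time.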
While reading PySpark's source code, I found a neat way it adds version information to docstrings: a @since() decorator. Here is an example:
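The following is a simplified sketch of the idea rather than PySpark's exact implementation: the decorator appends a Sphinx `.. versionadded::` directive to the function's docstring, indented to match the existing docstring body. The greet function is a made-up example.

```python
import re
from typing import Callable, TypeVar

F = TypeVar("F", bound=Callable)

# Matches the leading spaces of each indented docstring line
_INDENT = re.compile(r"\n( +)")


def since(version: str) -> Callable[[F], F]:
    """Decorator that records the version a function was added in."""

    def deco(func: F) -> F:
        doc = func.__doc__ or ""
        indents = _INDENT.findall(doc)
        # Reuse the smallest indentation found so the directive lines up
        indent = " " * (min(len(i) for i in indents) if indents else 0)
        func.__doc__ = f"{doc.rstrip()}\n\n{indent}.. versionadded:: {version}"
        return func

    return deco


@since("1.3.0")
def greet(name: str) -> str:
    """Return a greeting.

    :param name: who to greet
    """
    return f"Hello, {name}!"
```

Calling help(greet) or building the docs with Sphinx then shows the version note, while the function itself behaves exactly as before.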