<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom"><title>frankie-tales</title><id>https://lovergine.com/feeds/tags/computing.xml</id><subtitle>Tag: computing</subtitle><updated>2026-06-14T07:00:01Z</updated><link href="https://lovergine.com/feeds/tags/computing.xml" rel="self" /><link href="https://lovergine.com" /><entry><title>The only good AI is a dead one: myth and reality of AI tooling for code</title><id>https://lovergine.com/the-only-good-ai-is-a-dead-one-myth-and-reality-of-ai-tooling-for-code.html</id><author><name>Francesco P. Lovergine</name><email>mbox@lovergine.com</email></author><updated>2026-05-05T13:00:00Z</updated><link href="https://lovergine.com/the-only-good-ai-is-a-dead-one-myth-and-reality-of-ai-tooling-for-code.html" rel="alternate" /><content type="html">&lt;p&gt;Some months ago, I participated in &lt;a href=&quot;https://floss.social/@sjn@chaos.social/116062509535203478&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;this Mastodon poll&lt;/a&gt;.
My thesis was that, in general, there is no presumptive quality flag that can be
added to a product, whether it was created with AI-aided tools or not. My
opinion sparked a small, heated controversy among some readers.&lt;/p&gt;&lt;p&gt;Labeling a product as AI-generated and expecting low-quality results is
simplistic and prejudiced. The label presumes that the resulting product was accepted
without review or handcrafted changes, after nothing more than iterative adjustments to the LLM
prompt. This is clearly false in general for coding
(vibe coding, &lt;em&gt;per se&lt;/em&gt;, is not a black-and-white picture; it is full of shades
of grey), but it can also be applied to other creations.&lt;/p&gt;&lt;p&gt;Even the use of AI tooling has hundreds of nuances, especially in many
complex processes of creative production across multiple fields. AI tools or
not, the human contribution is still central to the process. One can abdicate
that role, or be fully responsible for the final result, with accurate changes
and handcrafted work based on a rough draft generated by AI. Presuming that
the use of AI helpers signals low-quality production negates the
importance of the human-in-the-loop. The point is that quality should be
considered an objective aspect of any product, with or without human authoring:
a handcrafted product is not necessarily a sign of quality &lt;em&gt;per se&lt;/em&gt;, and
a product is not necessarily poor because AI tooling was used at some phase of
its creation. At least for programming, we have had spaghetti code for ages, well
before AI agents.&lt;/p&gt;&lt;p&gt;If you take a walk around GitHub and look at too many projects to enumerate,
even without any AI intervention, you will find a lot of half-finished,
incomplete, alpha-quality, obsolete, or partially working code that would need a
good number of deep refactorings to be considered for production use. That’s not
a problem with AI use; it’s simply due to the not-too-recent shift to FOSS
coding as the mainstream approach to writing programs. Opening a GitHub
portfolio has assumed almost the same importance as opening a LinkedIn profile for
techies. In many cases, such proof-of-concept products would once have sat on a
developer’s shelf for years. Today, they populate GitHub (or any other
hub) repos instead.&lt;/p&gt;&lt;p&gt;On the opposite side, one could read &lt;a href=&quot;https://antirez.com/news/164&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;a recent post by
Antirez&lt;/a&gt;, which shows how productive and reasonable a meticulous
human-in-the-loop approach to automatic programming can be.
Not least, such an approach should currently be considered
essential for a contribution to existing or new code to be fully recognized as copyrightable,
as explained in &lt;a href=&quot;/the-artificial-author-copyright-and-copyleft-in-the-ai-era.html&quot;&gt;Simone Aliprandi’s recent
book&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;In conclusion, claiming that AI tooling produces only slop is clearly
prejudiced, no different from the past anti-FOSS prejudice that unpaid work done
in free time cannot be of good quality. Does it sound familiar? Sure, there are
a lot of low-quality projects out there, but again, that’s not necessarily the
case, and there are also many very good FOSS projects, created and maintained
with great care. Again, license type and development workflow are not sufficient
metrics for judging software quality, nor is the use of AI sufficient for the same
purpose. And that’s as true for coding as for other creative works. It’s not that
simple, dudes.&lt;/p&gt;&lt;p&gt;Guess what? 97% of 3,456 respondents in the Mastodon poll answered that an AI
mark on a product is a good idea, as a flag for presumptively low-quality design. I’m
quite sure most of them are also AI users in some form, of course, and that says
a lot about the future of this AI era and human ingenuity.&lt;/p&gt;</content></entry><entry><title>SM-Tools, Copernicus, FOSS and the reasons of inevitable choices and drifts</title><id>https://lovergine.com/sm-tools-copernicus-foss-and-the-reasons-of-inevitable-choices-and-drifts.html</id><author><name>Francesco P. Lovergine</name><email>mbox@lovergine.com</email></author><updated>2026-02-25T15:00:00Z</updated><link href="https://lovergine.com/sm-tools-copernicus-foss-and-the-reasons-of-inevitable-choices-and-drifts.html" rel="alternate" /><content type="html">&lt;p&gt;Here at work, we develop a series of tools for geospatial processing with
multiple goals, expected maintenance durations, different scopes, generalization
needs, and motivations. Note that about 10 years ago, here in Europe, a
completely new approach to upstream and downstream services for Earth
Observation began: ESA changed its data licenses, and distribution and access
modalities entered the Big Data era.&lt;/p&gt;&lt;p&gt;That even changed the data approach for academic institutions and triggered a
major shift in the daily work of researchers, with access to a tremendous volume
of weekly data available almost just in time and worldwide. Of course, such a
change also impacted us, and we had to adapt our processing and storage
capabilities to the new era.&lt;/p&gt;&lt;p&gt;One of my side projects in that regard is
&lt;a href=&quot;https://baltig.cnr.it/francesco.lovergine/sm-tools&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;SM-Tools&lt;/a&gt;, which consists
primarily of a collection of support tools for running our internally developed
soil moisture algorithm using SAR satellite data (&lt;a href=&quot;https://sarwater.irea.cnr.it/smosar.html&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;SMOSAR&lt;/a&gt;).
One such tool (now named &lt;code&gt;smt_copernicus&lt;/code&gt;) began
(and evolved with multiple restarts-from-scratch) more than 10 years ago, when
the Sentinel constellation started operations. Its purpose is to search for and
download satellite products from Copernicus archives using multiple criteria,
and to maintain an internal geospatial database of these products, along with
all derived maps and ancillary data. This is only one component of a system that
should be able to process large quantities of multi-source data on selected
areas of interest, create downstream products, and calibrate and analyze results
by comparing them with field data. The clear final goal is to achieve new
findings in satellite data analysis, supported by extended processing worldwide,
and to introduce new algorithms.&lt;/p&gt;&lt;p&gt;This is a long-term goal that unfortunately runs up against short- to mid-term
difficulties of accessing archives that are not under our direct control. The
sad reality is that in the last 4 years, the Copernicus archive access modality
has changed 3 times, and in the previous period, Copernicus also changed policies
and modalities in progress (e.g., by introducing online and offline products,
changing formats, etc.). Geospatial communities are small enough to encounter
more practical difficulties than expected in such operational conditions, and
this is now an almost weekly experience. We now have to chase other parties’
changes more often than we did in the past, instead of working on our own
goals.&lt;/p&gt;&lt;p&gt;For instance, until 2023, the main package used for accessing the Copernicus
archive was the &lt;a href=&quot;https://github.com/sentinelsat/sentinelsat&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;Sentinelsat&lt;/a&gt;
Python package (developed by a handful of willing
scholars starting in the summer of 2015). It became abandonware in 2023, when
people discovered that the protocol changes required rewriting most of the
package from scratch, including all test code and mocks. Another protocol change happened
at the end of last year. Incredibly enough, all access protocols have been in
flux since 2023, even in non-trivial details, and have required frequent
adjustments to avoid unexpected breakage in download and processing pipelines.
That certainly does not encourage community-supported FOSS solutions. Precisely in
2023, when I discovered our Sentinel-related tool had to be deeply changed
because the Sentinelsat package had become obsolete, I decided that enough
was enough and migrated from Python to a fully self-supported Perl
reimplementation. One of the things I have always hated in the Python ecosystem is
the excessive (for me and our purposes at least) speed of deprecation of
consolidated packages and features, and the prospect of having to chase
unexpected changes on both the Copernicus AND Python sides was out of the question. My
experience with Perl has been much less annoying in this regard, with scripts
running perfectly even 20 years after they were written. Let’s consider
this the old-school approach: if something is working, don't touch it without a
more than valid reason, and even then, think twice before you touch.&lt;/p&gt;&lt;p&gt;In the meantime, around 2019, another independent effort started to support
Copernicus access, along with a few other data providers. That’s about 4 years
after the original Sentinelsat project. Timing is essential to
consider in this case. &lt;a href=&quot;https://github.com/CS-SI/eodag&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;EODag&lt;/a&gt;
is a single-company FOSS product that has been actively
developed since then, but could only be considered stable 1-2 years later. Of
course, it only provides the usual access layer, and using that package implied,
at the time, replacing Sentinelsat with EODag as a base layer for searching and
downloading only, while performing other tasks afterward with other self-made
code. Before 2023, Sentinelsat and EODag were equivalent for performing
the same task, with very little advantage either way.&lt;/p&gt;&lt;p&gt;Note that both tools were, in any case, reasonably adequate but not sufficient for our
goals, and also had a few defects (or, better said, a lack of flexibility) to
work around in some creative way. That was one of the reasons for not replacing
Sentinelsat with EODag in 2023. The other major one was that
replacing a small package with another (as with Sentinelsat, there are just a
couple of main contributors to the codebase, along with a good number of
pending issues and PRs for a product of this kind) is probably not the safest
way to avoid problems in the near future, when Copernicus changes things
again (see the upcoming Earth Observation Processing Framework, EOPF, data format:
Zarr). And of course, EODag is written in Python, and I have already expressed my
concerns about that.&lt;/p&gt;&lt;p&gt;Whether you like it or not, nowadays, the concrete alternative to adopting small
FOSS projects to perform basic tasks is to use AI tooling to create a perfectly
(or almost so) tailored implementation for the target task. While in 2023 I had
to rewrite from scratch (in maybe a month of work, including some fixes to the
&lt;a href=&quot;https://metacpan.org/dist/Geo-GDAL-FFI&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;Geo::GDAL::FFI package&lt;/a&gt;
for a multi-threading issue) an implementation of a
multi-threaded tool for accessing the Copernicus archive and maintaining an
internal geospatial database of products consistently among multiple
generations of the archive, I implemented the STAC protocol variant in a few
hours instead, thanks to Claude Code-based patching and my reviewing and tests
for the resulting codebase. As said in &lt;a href=&quot;/is-ai-driven-coding-the-start-of-the-end-of-mainstream-foss&quot;&gt;this
post&lt;/a&gt;, currently,
the cons of adopting a small third-party FOSS solution outweigh the pros,
particularly regarding the resulting technical debt, compared with a
well-conducted self-consistent AI-based development process.&lt;/p&gt;&lt;p&gt;I’m seeing in my own side projects exactly the mirror of what will probably be
the reality of FOSS projects in the near future, in practice, as I mentioned in
the previous post. Relatively few major/interesting projects will be adopted by
others and attract contributions, while most codebases will become pure one-man
shows, with AI tooling.&lt;/p&gt;&lt;p&gt;A significant part of geospatial processing involves data procurement and
processing, i.e., refining and preparing data and images in order to collect,
filter, and process large volumes of data for subsequent analysis. This is the
most annoying and repetitive part of the process, and also often the most
time-consuming. In my experience, working on those tasks is probably the most
effective way to use LLMs through a spec-and-test-driven design. Whether you
like it or not, it is the most immediate way to produce working code by
iterating with a chain of thought and careful review of results, including
decent test coverage. As observed in &lt;a href=&quot;https://antirez.com/news/159&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;Antirez's
experience&lt;/a&gt;, the AI agent also had the nice
ability to retain my Perl style (which is not secondary, given that Perl has
many programming flavors and variants).&lt;/p&gt;&lt;p&gt;Maybe the final result will be an increase in quick-and-dirty codebases, but
for many scholars, it will be a major simplification of their lives. In most
domains of science, coding activities have been seen as an inevitable evil: they
are a tool, not the primary goal, and even before the advent of LLMs, most
scientific codebases were far from something to be proud of. The Copernicus
FAFO attitude will also encourage such approaches, simply because scholars
don't have time to waste chasing changes introduced by this or that data
provider or company when contracts change.&lt;/p&gt;&lt;p&gt;An AI slop attitude? No, simple survival instinct.&lt;/p&gt;</content></entry><entry><title>A decent SSH client for Android is not what one would expect</title><id>https://lovergine.com/a-decent-ssh-client-for-android-is-not-what-one-would-expect.html</id><author><name>Francesco P. Lovergine</name><email>mbox@lovergine.com</email></author><updated>2026-02-08T16:00:00Z</updated><link href="https://lovergine.com/a-decent-ssh-client-for-android-is-not-what-one-would-expect.html" rel="alternate" /><content type="html">&lt;p&gt;I’m not too happy with using mobile devices as a daily driver for server
connections. When one needs to use a keyboard, in many cases the appropriate
device is inevitably a laptop or a desktop computer. Anyway, sometimes
a mobile phone has to be used as an SSH client for an emergency
or to perform simple remote tasks, possibly using an app that is usable, decently
supported, and can share common configurations among multiple devices with a
reasonable level of security.&lt;/p&gt;&lt;p&gt;For years, the most used client on Android has been &lt;em&gt;JuiceSSH&lt;/em&gt;, but unfortunately,
it became abandonware some years ago, and at the end of last year, it was also
delisted from the Play Store. A few days ago, I changed my smartphone and
finally moved to the latest &lt;a href=&quot;https://shop.fairphone.com/home&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;Fairphone&lt;/a&gt; model.
Now, anyone who knows me already knows how
much I hate mobile phones, which are generally the realm of apps with the worst
UX ever conceived. Since then, I have discovered that the pro option of JuiceSSH is
also dead, and basically, SSH forwarding cannot be used anymore. Too bad. I
decided to look for a state-of-the-art SSH client application and discovered
that the generally suggested apps (i.e., Termius and ConnectBot) are the usual
PITA.&lt;/p&gt;&lt;p&gt;Thank gosh, &lt;em&gt;Termux&lt;/em&gt; entered the room.&lt;/p&gt;&lt;p&gt;For people who don't know it, it is an Android terminal that emulates a color
xterm, but has some nice features, specifically a damn good
&lt;a href=&quot;https://wiki.termux.com/wiki/Package_Management&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;package management tool&lt;/a&gt;
built in. Of course, it is pure FOSS.&lt;/p&gt;&lt;pre&gt;&lt;code&gt;pkg install openssh git vim&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Now it could be nice to access the common storage area by enabling it with&lt;/p&gt;&lt;pre&gt;&lt;code&gt;termux-setup-storage&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;This could be useful for exchanging files with remote hosts and keeping them
available for/from other apps.&lt;/p&gt;&lt;p&gt;Now, a useful &lt;code&gt;.ssh/config&lt;/code&gt; file can be pulled via &lt;code&gt;git&lt;/code&gt;, stored locally, and
possibly customized to simplify ssh network access, with all helpful host
stanzas. Even a passphrase-protected key pair can be created and stored
locally for safety.&lt;/p&gt;&lt;p&gt;So far, so good. Probably, I should simply stop treating Android
as a special beast and treat it as just another Linux host. Fewer apps, more
terminal, and fuck the majority!&lt;/p&gt;</content></entry><entry><title>About computing environments for reproducible science</title><id>https://lovergine.com/about-computing-environments-for-reproducible-science.html</id><author><name>Francesco P. Lovergine</name><email>mbox@lovergine.com</email></author><updated>2025-12-09T13:00:00Z</updated><link href="https://lovergine.com/about-computing-environments-for-reproducible-science.html" rel="alternate" /><content type="html">&lt;p&gt;A few weeks ago I gave a lecture for the &lt;a href=&quot;https://spatial-ecology.net/course-geocomputation-machine-learning-for-environmental-applications-intermediate-level-2025/&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;Spatial Ecology
course&lt;/a&gt;
to introduce a handful of junior and not-so-junior researchers from various
domains to the not-so-nice world of scientific computing environments.&lt;/p&gt;&lt;p&gt;&lt;img src=&quot;/images/galileo.png&quot; alt=&quot;Poor Galileo working on modern computer&quot; /&gt;&lt;/p&gt;&lt;p&gt;For people interested,
&lt;a href=&quot;https://spatial-ecology.net/docs/source/lectures/lect_20252511_dependency_management_in_data_science.pdf&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;here&lt;/a&gt;
are my slides about this topic. They are somewhat specialized for the Python
ecosystem (Python having nowadays become the language of choice for
scientific computing in multiple contexts), where, in the last few years, the
management of dependencies and of the computing environment
has evolved a lot. This problem is amplified in the HPC
context (I already wrote &lt;a href=&quot;/does-hpc-mean-high-pain-computing.html&quot;&gt;a semi-serious post&lt;/a&gt; about this topic).&lt;/p&gt;&lt;p&gt;I also cited &lt;a href=&quot;https://guix.gnu.org/&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;&lt;em&gt;guix&lt;/em&gt;&lt;/a&gt; without going into detail (it was impossible to deal with all
sub-topics in the lecture, and I know that multiple listeners already had
problems fully understanding the matter).&lt;/p&gt;&lt;p&gt;Reasoning about that, it is not a silly idea to write some blog notes about the
whole topic. First of all, what is the context? &lt;a href=&quot;https://pmc.ncbi.nlm.nih.gov/articles/PMC2981311/&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;Reproducible science&lt;/a&gt;
is not a novel matter. Any scientific experiment should be reproducible, starting from the same
data and giving comparable results: this is the basis of the &lt;a href=&quot;https://en.wikipedia.org/wiki/Scientific_method&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;scientific method&lt;/a&gt;
(Galileo docet). In the
context of scientific computing, that implies that the whole execution
environment should be fully reproducible, so that executions can be
replicated with the same outputs starting from the same
inputs, possibly much later, on the same platform or after deployment
on a new, completely different system.&lt;/p&gt;&lt;p&gt;The key point is that the long-term reproducibility of such results on current
platforms and with current languages is minimal, to be generous. Having the
full source code of a Python notebook, a git source repository, or anything
comparable is only the starting point. The sad reality is that in
practice, the source code has, in too many cases, a lifetime of a few months
because the average scholar underestimates this problem. When
following a few good practices, such a lifetime can be extended to a few years,
maybe.&lt;/p&gt;&lt;p&gt;When I wrote my thesis, too many years ago, I developed the whole C source for
execution on a parallel computer of the time. It was a &lt;a href=&quot;https://en.wikipedia.org/wiki/Meiko_Scientific&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;Meiko Computing Surface&lt;/a&gt;,
a MIMD platform based on &lt;a href=&quot;https://en.wikipedia.org/wiki/Transputer&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;INMOS Transputers&lt;/a&gt;. The C code
used a proprietary message-passing library, CSTools, to enable communication
among T-800 processors (unfortunately, there is no relation with the Terminator
series, sorry). Now, it is somewhat expected that code based on a dead
proprietary library running on a dead hardware platform could have
reproducibility issues today, after more than 30 years.&lt;/p&gt;&lt;p&gt;What is unexpected is that one could have the same reproducibility problems
after 30 months, or in some limited cases, after 30 days. I mean both at the
binary and source levels, often. Now, part of the problem is due to the &lt;a href=&quot;https://www.merriam-webster.com/slang/fafo&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;FAFO&lt;/a&gt;
attitude of some development communities. Not all teams are like the GDAL
one, which is capable of maintaining the same well-refined APIs for
decades. More usually, from time to time, new versions of libraries and tools
introduce expected or unexpected breakages against past versions and APIs, which
backfire on programs that use them. In other cases, new versions can fix and/or
introduce bugs of primary interest for dependent software. Those are the main
reasons to meticulously annotate and document every single version of direct and
indirect dependencies. This is somewhat solved by dependency resolvers, as
explained in my lecture. But that's only part of the whole chain.&lt;/p&gt;&lt;p&gt;Unfortunately, nowadays this chain of dependencies goes beyond a single language
and runs into system-level dependencies, including the whole operating
system, with its system compilers, interpreters, and libraries. This problem is
amplified in a fully containerized world, an approach that is used intensively nowadays.
Depending on a third-party binary image taken from any hub out there is
not a safer approach. Such images can disappear overnight or have a
limited lifetime, so the conscientious scholar should also build his/her own from
scratch, which is often well beyond the skills of the average
scholar.&lt;/p&gt;&lt;p&gt;This is exactly where Guix tries to give an answer. Guix is a source-level
package manager with a set of full descriptions written in Guile Scheme for the
whole chain of dependencies up to the kernel level. Combining such an analytical
description of how each artifact is built (derivations), at any point in the
timeline, with the possibility of using build systems to
cache binary artifacts (substitutes) and of installing any software at the user
level allows the creation of a source-level definition of a full execution
environment.&lt;/p&gt;&lt;p&gt;Such an ambitious goal is not without problems, as masterfully summarized by
Ludovic Courtès
&lt;a href=&quot;https://hpc.guix.info/blog/2024/03/adventures-on-the-quest-for-long-term-reproducible-deployment/&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;here&lt;/a&gt;,
but anyway it is light-years ahead of the average
deployment system, which instead needs continuous babysitting in order to ensure a
working environment.&lt;/p&gt;&lt;p&gt;What would probably also be of general interest is consistent additional
security tagging for derivations, in order to quickly identify sources with known
CVE-tagged versions in the chain of dependencies. That would increase the level
of awareness when the Guix time machine is used to go back in time and
pick some sources from Pandora's box. It would also be of considerable interest
in the &lt;a href=&quot;https://en.wikipedia.org/wiki/Software_supply_chain&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;SBOM&lt;/a&gt; context
outside the perimeter of scientific computing.&lt;/p&gt;&lt;p&gt;So, Guix is not perfect, but it is certainly a step forward towards reproducible computing
environments, which are currently lacking in one way or another in the science domain
(and not only that).&lt;/p&gt;&lt;h2 id=&quot;references&quot;&gt;References&lt;/h2&gt;&lt;p&gt;[1] &lt;a href=&quot;https://hpc.guix.info/blog/2023/06/a-guide-to-reproducible-research-papers/&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;A guide to reproducible research papers&lt;/a&gt;&lt;/p&gt;&lt;p&gt;[2] &lt;a href=&quot;https://zenodo.org/records/7088068&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;Guix as a tool for reproducible science&lt;/a&gt;&lt;/p&gt;&lt;p&gt;[3] &lt;a href=&quot;https://inria.hal.science/hal-04776900/&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;Using Guix for managing reproducible, flexible, and collaborative environments in a PhD thesis&lt;/a&gt;&lt;/p&gt;&lt;p&gt;[4] &lt;a href=&quot;https://doi.org/10.1101/298653&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;Reproducible genomics analysis pipelines with GNU Guix&lt;/a&gt;&lt;/p&gt;&lt;p&gt;[5] &lt;a href=&quot;https://en.wikipedia.org/wiki/Replication_crisis&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;Replication crisis&lt;/a&gt;&lt;/p&gt;</content></entry><entry><title>A call to minimalistic programming</title><id>https://lovergine.com/a-call-to-minimalistic-programming.html</id><author><name>Francesco P. Lovergine</name><email>mbox@lovergine.com</email></author><updated>2025-09-10T17:00:00Z</updated><link href="https://lovergine.com/a-call-to-minimalistic-programming.html" rel="alternate" /><content type="html">&lt;p&gt;Minimalism in development is a forgotten virtue of our time that should gain
more attention. A straightforward summary of some minimalism principles is
available &lt;a href=&quot;http://minifesto.org/&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;here&lt;/a&gt;. Briefly, the principles of minimalism
in Software Engineering can be summarized as follows, based on the manifesto for
minimalism.&lt;/p&gt;&lt;ol&gt;&lt;li&gt;&lt;em&gt;Fight for Pareto's law&lt;/em&gt;: look for the 20% of effort that will yield 80% of the results.&lt;/li&gt;&lt;li&gt;&lt;em&gt;Prioritize&lt;/em&gt;: minimalism isn't about not doing things but about focusing first on the important.&lt;/li&gt;&lt;li&gt;&lt;em&gt;The perfect is the enemy of the good&lt;/em&gt;: first do it, then do it right, then do it better.&lt;/li&gt;&lt;li&gt;&lt;em&gt;Kill the baby&lt;/em&gt;: don't be afraid of starting all over again. Fail soon, learn fast.&lt;/li&gt;&lt;li&gt;&lt;em&gt;Add value&lt;/em&gt;: continuously consider how you can support your team and enhance your position in that field or skill.&lt;/li&gt;&lt;li&gt;&lt;em&gt;Basics, first&lt;/em&gt;: always follow top-down thinking, starting with the best practices of computer science.&lt;/li&gt;&lt;li&gt;&lt;em&gt;Think differently&lt;/em&gt;: simple is more complicated than complex, which means you'll need to use your creativity.&lt;/li&gt;&lt;li&gt;&lt;em&gt;Synthesis is the key to communication&lt;/em&gt;: we have to write code for humans, not machines.&lt;/li&gt;&lt;li&gt;&lt;em&gt;Keep it plain&lt;/em&gt;: try to keep your designs with a few layers of indirection.&lt;/li&gt;&lt;li&gt;&lt;em&gt;Clean kipple and redundancy&lt;/em&gt;: minimalism is all about removing distractions.&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;Most of those principles are coherent with each other and relate heavily to the
well-known Unix &lt;a href=&quot;https://en.wikipedia.org/wiki/KISS_principle&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;KISS principle&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;An extended and fascinating book about the practical application of such
principles is Eric S. Raymond's &lt;a href=&quot;http://www.catb.org/~esr/writings/taoup/html/&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;&lt;em&gt;&amp;quot;The Art of Unix Programming&amp;quot;&lt;/em&gt;&lt;/a&gt;, which I
strongly recommend reading. I can also recommend a now-classic volume on the
same topic by John Ousterhout, &lt;a href=&quot;https://web.stanford.edu/~ouster/cgi-bin/book.php&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;&lt;em&gt;&amp;quot;A Philosophy of Software Design&amp;quot;&lt;/em&gt;&lt;/a&gt;. Both outline
practical examples of how minimalism in design can be effectively embraced, with
a focus on doing the right thing sooner rather than later.&lt;/p&gt;&lt;p&gt;The same principles could (or maybe should) be applied even to programming
languages, but this is often a neglected aspect of such a minimalistic approach.
Note that one of the most successful languages of all time is the C language,
which indeed has a straightforward syntax and, precisely for that reason, is not easy to use
correctly (the principle being that what is simple is not necessarily easy).
That's because the programmer needs to create her/his own abstractions and
layers to build her/his vision of a software design. It seems that this is
precisely the opposite of the C++ or Java approach, where the entire
specification spans thousands of pages, and many high-level abstractions are
integral parts of the language. The same can be said of Python nowadays,
which started as a simple language, more readable and cleaner than Perl, but now
has a large and elaborate specification. Again, hundreds of pages are now
needed to describe a once-simple language, where tons of new features and
abstractions have been added to enrich its expressiveness.  If one considers its
standard libraries and modules, the actual situation appears even worse.  Can
such an approach be considered &lt;em&gt;easier&lt;/em&gt;? I don't think so. Let me say: how can a
program be considered simple if it relies on hundreds (or even thousands,
including dependencies recursively) of external modules, as well as hundreds
of syntactic constructs and glue code? Some languages also
manage multi-versioned dependencies, allowing a program to cross-depend on
multiple editions of the same module (yes, JavaScript, I'm talking about you),
with the concrete possibility of introducing obscure bugs as a result. At the
opposite extreme, there is the consideration that we only know and deeply
understand what we make.&lt;/p&gt;&lt;p&gt;Minimalism also means actively seeking a balance between these two opposing
approaches, because reusing third-party modules and packages can be an immediate
answer to deadline pressure, but can also introduce instability
and dependencies on unmaintained software in the long run.
Long dependency chains, where changes happen independently of the main program's
focus and are driven by third-party motivations (often with the wrong
timing for dependent projects), can cause breakages at multiple levels.&lt;/p&gt;&lt;p&gt;Of course, to reach
the right tradeoff, a few things need to be considered: no single programmer
is likely to be smarter than many of the libraries and modules out there, which
multiple developers may have spent hours, weeks, months, or even years refining.
That's true, but it is also true that not all libraries or modules are
written with the same level of quality and effort. For instance, we all know
cases of elementary modules available for Node that could easily be avoided, and
are instead imported out of sheer laziness in development. Sometimes, too, the
features actually needed are only a small portion of the whole
library/module, which could be reimplemented with very reasonable effort and
time. This approach is even more viable in modern times, when AI tools can
significantly increase productivity in such cases. I would summarize these
concepts with some additional mottos:&lt;/p&gt;&lt;ol&gt;&lt;li&gt;&lt;em&gt;Limit your external dependencies&lt;/em&gt;: avoid depending on modules or libraries
that are not strictly required to significantly reduce the total development
time, that lack rock-stable interfaces and features, or that do not have
a clear and stable roadmap.&lt;/li&gt;&lt;li&gt;&lt;em&gt;Reproducibility of the software stack is a must&lt;/em&gt;: these days,
&lt;a href=&quot;https://en.wikipedia.org/wiki/Software_supply_chain&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;an SBOM&lt;/a&gt; has
become recommended or even mandatory, but documenting external
dependencies and their versions is not enough: the full process of building a
runtime environment should also be fully defined and consistent for the long
term.&lt;/li&gt;&lt;li&gt;&lt;em&gt;Do not follow the latest oh!-so-cool technology&lt;/em&gt;: while that could be done for
an amateur project developed in your spare time, it is not a good idea
to depend on a technology whose future is not clearly stated and that lacks a
well-established development team and proven sustainability in the long
term. I consider it a risk even to depend on a single-company project, and even
more so if that company is a startup. In short, this can be generically
considered as minimalism in coding style.&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;Moreover, using a well-established framework, such as Django,
for developing your mid-to-long-term web project is probably better than
using a Node.js-based framework created six months ago that seems to be the
latest 'big thing'. But that's probably only common sense. Instead, ask yourself
if your project should be created from scratch using a simple &lt;em&gt;jamstack system&lt;/em&gt;
and some microservices for well-defined and minimal parts. That
is more than enough for many of the CMS-based sites out there: indeed, I
continuously ask myself why a lot of websites are still based on WordPress, when
most of them could be easily converted into a handful of static pages and the simple
JavaScript snippets that they use in any case. This can be framed in
terms of minimalism in defining computing architectures, which can also make
scaling up applications easier.&lt;/p&gt;&lt;p&gt;So minimalism principles can be considered at multiple levels: for programming
languages, libraries, architectures, and design. However, they require skills,
in-depth research, and a significant amount of time dedicated to continuous
refactoring and reflection on viable alternatives. And that's probably the
key point: developers with deadlines and urgency imposed by PMs are too often
tempted to follow the easiest and richest paths and provide a solution of any
kind without much reflection on the final balance among effort, quality,
efficiency, and durability of results.&lt;/p&gt;&lt;p&gt;Of course, speaking of minimalism, a special mention is due to the whole
&lt;a href=&quot;https://suckless.org&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;suckless effort&lt;/a&gt; on the uncompromising minimalism side.
And &lt;a href=&quot;https://motherfuckingwebsite.com/&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;why not?&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;Ok, ok, I'm joking. But you got the point.&lt;/p&gt;</content></entry><entry><title>Does HPC mean High-Pain Computing?</title><id>https://lovergine.com/does-hpc-mean-high-pain-computing.html</id><author><name>Francesco P. Lovergine</name><email>mbox@lovergine.com</email></author><updated>2025-09-06T19:40:00Z</updated><link href="https://lovergine.com/does-hpc-mean-high-pain-computing.html" rel="alternate" /><content type="html">&lt;p&gt;Please, forgive the silly joke in the title of this semi-serious post, but
lately I have been thinking about the strange fate of an area of general
computing in which I have spent more and more time recently, as I did in the near and
far past. For my job, I have used a series of scientific HPC clusters
worldwide to solve multiple computing problems as efficiently as possible by distributing
computation across numerous nodes. Over the last thirty years, all such
platforms have consistently shared the same characteristics, which
invariably pose a problem for the average scientist
(often a junior researcher dedicated to a short-term project) in any
application domain.&lt;/p&gt;&lt;p&gt;&lt;img src=&quot;/images/high-pain-computing.jpg&quot; alt=&quot;HPC means high-pain computing&quot; /&gt;&lt;/p&gt;&lt;p&gt;To borrow Fred Brooks' distinction between essential and accidental difficulties, HPC technologies pose both intrinsic and
incidental difficulties for this category of users. The intrinsic one is due to the inherent
complexity of creating a parallel and distributed solution to any problem,
possibly in a way that does not penalize the final implementation through the
increase in communication time among computational agents. This is already a
relevant problem &lt;em&gt;per se&lt;/em&gt;, which can often be beyond the abilities, knowledge, and
interests of the average researcher in bioinformatics, physics, mathematics,
remote sensing, or any other research domain.&lt;/p&gt;&lt;p&gt;The incidental difficulty is instead always due to the accessibility of platforms and the
technologies used for their implementation. Broadly speaking, all such HPC clusters are
a large pool of multi-core hosts with plenty of memory, connected by
multiple high-speed networks for implementing some sort of multi-tier
distributed POSIX file system and/or object storage. Users can log in to a
limited number of such hosts, which are connected to all the others and run some type
of scheduling system (e.g., Slurm or HTCondor) through which multiple computational nodes can
be reserved for a limited period of time to execute batch jobs or even
interactive ones (mainly for debugging). In most cases, such clusters can also be
used with some MPI/OpenMP implementations for proper parallel computational
modeling based on message passing among computing agents that run on multiple
cores and hosts, with or without multi-threading. Alternatively, GPUs can also
be reserved and exploited via CUDA/OpenCL. In many cases, such implementations
are vendor-oriented and trigger the need to adopt specific libraries and
compilers that add another layer of complexity to implementations.&lt;/p&gt;&lt;p&gt;The incidental problems start when casual users discover that all such computing
nodes invariably run some legacy enterprise Linux distribution that is maintained
for ten years or even more, until a full reinstallation of the whole
cluster. On top of such legacy systems (which, for
any practical purpose, are simply unusable as such), these scientific clusters essentially offer
a few different mechanisms for creating a general computational
environment:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;a href=&quot;https://modules.readthedocs.io/en/latest/&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;Environment Modules&lt;/a&gt;&lt;/li&gt;&lt;li&gt;Containers (&lt;a href=&quot;https://sylabs.io/singularity/&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;Singularity&lt;/a&gt; or &lt;a href=&quot;https://apptainer.org/&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;Apptainer&lt;/a&gt;)&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://www.anaconda.com/&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;Anaconda/Miniconda&lt;/a&gt;-like environment (or free forks like &lt;a href=&quot;https://github.com/conda-forge/miniforge&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;Miniforge&lt;/a&gt;)&lt;/li&gt;&lt;li&gt;Some specific software/application to run&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;But for containers, the other solutions are all binary-based hubs, which could
expose users to breakages when the application being developed needs to access
exotic language bindings for extensions, and the poor users enter the mysterious
and dangerous world of ABI violations and chains of broken dependencies. Moreover,
such hubs are not always consistent, and any upgrade by the admin team
exposes users to sudden breakages overnight.&lt;/p&gt;&lt;p&gt;The final solution (or apparently so) nowadays is using containerization and a
target environment where the user code can find all and only the correct
dependencies and versions for the whole software stack of the application. This holds,
at least, as long as the third-party hubs of base distributions and languages ensure
complete consistency and retain past binaries and versions for any
medium/long-term need. Of course, a full source-based stack with proper version
tracking &lt;em&gt;a la&lt;/em&gt; &lt;a href=&quot;https://lovergine.com/tags/guix.html&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;Guix&lt;/a&gt; would help to avoid
dependencies on external binary hubs and seems the way to go. Indeed, a small
interest group around such a solution has existed for a few years, but I am
not aware of many HPC clusters that consistently offer this kind of
implementation to users. That said, writing Guile Scheme descriptors for
preparing an execution environment may not be within the reach of the average
researcher in biochemistry or astrophysics.&lt;/p&gt;&lt;p&gt;Unfortunately, as I wrote
&lt;a href=&quot;https://lovergine.com/are-distributions-still-relevant.html&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;in a past post&lt;/a&gt;
on this site, this moves the
whole responsibility for maintaining the software stack onto the shoulders of the
final users, who are often the infamous junior profiles I mentioned before.
These are non-IT specialists who are expected to adopt such HPC platforms to implement
solutions as part of their daily job in their specific scientific domain.&lt;/p&gt;&lt;p&gt;The result, to be honest, is that the average researcher simply tries to avoid
the whole thing as much as possible because of the significant complexity
involved, while the private sector has introduced specialized
roles of data and software engineers to manage such problems properly (which is
the only reasonable approach, indeed).  Adding insult to injury, in some
academic areas, such interests in HPC are also viewed with contempt or as a
waste of time, if not openly discouraged.&lt;/p&gt;&lt;p&gt;All this explains why a tour of any of the major HPC clusters
worldwide often guarantees hilarious experiences in terms of who is doing what
and how.&lt;/p&gt;&lt;p&gt;Sometimes, I almost feel like I can hear them swearing...&lt;/p&gt;</content></entry></feed>