[Daniel's week] December 6, 2024
Daniel Stenberg
daniel at haxx.se
Fri Dec 6 16:16:05 CET 2024
Hello.
I'm back. Things happened.
## slowember
... is over. I was back at my work desk Monday morning and started working on
the list of piled up tasks that had accumulated for me over the last few
weeks. I kept up reasonably with my email and issue feeds even while away so I
had a pretty good idea about the state of things. Most important curl matters
were also properly managed by my awesome co-maintainers in the project in my
absence.
A bonus follow-up effect of my jet lag is that I have woken up EARLY every
morning this whole week and have started working much earlier than I normally
do! It has helped me catch up faster.
## merge fest
As curl 8.11.1 is planned to ship next week I have spent this week making sure I
merge all pending PRs and work that we want included in this release. Some of
them queued up in my absence. I think we might end up with a slightly lower
bugfix rate in this release cycle than we had in the previous few, largely
because of slowember.
Since this release cycle is done without any new features merged at all, it is
somewhat of a stability effort so I think it makes sense that it is a bit
calmer and "controlled". The previous curl release had just a tad too many
annoying regressions.
## AnyConnect
As I received another spike of emails this week regarding problems with the
Cisco product called AnyConnect I finally could not prevent myself from
blogging about it [1] since it does not seem to go away. Emails about this
product have been arriving in my mailbox since at least 2018 [2].
Amusingly, already the next day I got a similar email asking questions about
*another* VPN software...
## sscanf
When I first got into C coding back in the early 1990s I came to appreciate
the powers of the scanf() family of functions and I have written countless
lines of code using primarily the sscanf() function and I believe I
have a decent grip and understanding of how it works.
Over time however, the subtle side-effects that come with this function and
that are hard to avoid have made me realize that it is a little too hard to
use properly, safely and responsibly: it allows multiple spaces if there is a
single one in the match pattern, we must use separate buffers to store the
result, there is no way to detect number overflows, etc. Mistakes that are
hard to spot happen too easily.
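To make the white space point concrete, here is a minimal sketch. The names
lenient_pair and lenient_demo are made up for illustration and are not curl
code. It only shows that one space in the format string accepts far more
than one space in the input.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper, not curl code: parse "<int> <int>" with sscanf().
   Returns the number of fields that sscanf() converted. */
int lenient_pair(const char *input, int *a, int *b)
{
  /* The single space in the format matches any run of white space,
     including tabs and newlines, and %d skips leading white space too. */
  return sscanf(input, "%d %d", a, b);
}

/* All of these inputs are accepted as the same pair of numbers. */
void lenient_demo(void)
{
  int a;
  int b;
  assert(lenient_pair("1 2", &a, &b) == 2);
  assert(lenient_pair("1 \t\n   2", &a, &b) == 2 && a == 1 && b == 2);
  assert(lenient_pair("  1 2 trailing junk", &a, &b) == 2);
}
```

Note that the return value alone cannot tell these inputs apart, which is
part of why a parser built this way ends up less strict than intended.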
The consequence of this realization is that we have slowly been working on
reducing the sscanf() use in curl source code for a while now, to make our
string parsers stricter, with better knowledge of exactly what is matched,
and to bail out correctly when the matching fails.
As I merged a few PRs to this effect this week, I created a new graph for the
dashboard that shows the sscanf use in curl over time [3] - and I have two PRs
pending: one that removes the last three sscanf() calls in libcurl, by
introducing a small set of string parser helper functions that I call strparse
[4] and then the follow-up one that bans sscanf use in curl code [5]. I will
wait until after the pending release before I merge these two though, to not
rock the boat too much just days before the release.
Once the strparse module gets merged, I should also go over other parsers in
curl and see if we can unify them better and use this new set of parser
functions more widely.
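As a rough idea of what stricter parsing can look like, here is a small
sketch. It is NOT the strparse API from the PR: the names strict_number,
strict_single_space and strict_pair are hypothetical, and it assumes that
plain strtol() with an errno check is good enough for overflow detection.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helpers for illustration; NOT curl's actual strparse API. */

/* Accept exactly one space character. Returns 0 on success, 1 on mismatch. */
int strict_single_space(const char **p)
{
  if(**p != ' ')
    return 1;
  (*p)++;
  return 0;
}

/* Parse a decimal number with no leading white space or sign allowed.
   Returns 0 on success, 1 if there is no digit, 2 on overflow. */
int strict_number(const char **p, long *out)
{
  char *end;
  if(**p < '0' || **p > '9')
    return 1; /* strtol() alone would skip white space and accept +/- */
  errno = 0;
  *out = strtol(*p, &end, 10);
  if(errno == ERANGE)
    return 2; /* the value did not fit in a long */
  *p = end;
  return 0;
}

/* Parse exactly "<number> <number>" with nothing before, between or after.
   Returns 0 on success, 1 on any mismatch. */
int strict_pair(const char *input, long *a, long *b)
{
  const char *p = input;
  if(strict_number(&p, a) || strict_single_space(&p) ||
     strict_number(&p, b))
    return 1;
  return *p != '\0'; /* trailing data is an error */
}

/* Only the exact form passes; the lenient variants are rejected. */
void strict_demo(void)
{
  long a;
  long b;
  assert(strict_pair("1 2", &a, &b) == 0 && a == 1 && b == 2);
  assert(strict_pair("1  2", &a, &b) != 0);
  assert(strict_pair("1 2 junk", &a, &b) != 0);
}
```

Each helper advances the pointer only when it matched, so the caller always
knows exactly how much of the input was consumed and why parsing stopped.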
## split getparameter
Recently I mentioned my effort to reduce complexity of the "worst" functions
in curl. One of the functions on the top-list of complexity is the one called
getparameter in the curl tool: it holds the huge switch() for all the command
line options. As there are 266 different options (and slowly increasing) this
function is destined to remain fairly huge but this time I decided that for
each option that needs more than 15 lines of code to get parsed and handled, I
would split it out into a separate sub function.
In that single PR the complexity score for this big function decreased by 37
percent from 247 to 147. I think it became aesthetically more pleasing when
eyeing the code as well. More manageable. It is still a huge function of
course.
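The pattern itself is simple. Below is a minimal sketch of it with made-up
names (struct config, OPT_RANGE, parse_range, getparam); none of this is the
actual curl tool code and the range parsing is deliberately simplified.

```c
#include <assert.h>
#include <stdlib.h>

/* All names here are made up for illustration; this is not curl's code. */
struct config {
  int verbose;
  long low;  /* start of a parsed "low-high" range */
  long high; /* end of the range */
};

enum opt { OPT_VERBOSE, OPT_RANGE };

/* An option that needs many lines gets its own sub function so that the
   big switch only dispatches. Returns 0 on success, 1 on bad input. */
int parse_range(struct config *c, const char *arg)
{
  char *end;
  long high;
  long low = strtol(arg, &end, 10);
  if(end == arg || *end != '-')
    return 1; /* no number, or no dash after it */
  arg = end + 1;
  high = strtol(arg, &end, 10);
  if(end == arg || *end != '\0' || high < low)
    return 1; /* no second number, trailing data or reversed range */
  c->low = low;
  c->high = high;
  return 0;
}

int getparam(struct config *c, enum opt o, const char *arg)
{
  switch(o) {
  case OPT_VERBOSE: /* short cases stay inline */
    c->verbose = 1;
    return 0;
  case OPT_RANGE: /* long cases become a single call */
    return parse_range(c, arg);
  }
  return 1; /* unknown option */
}

void getparam_demo(void)
{
  struct config c = { 0, 0, 0 };
  assert(getparam(&c, OPT_VERBOSE, NULL) == 0 && c.verbose == 1);
  assert(getparam(&c, OPT_RANGE, "10-20") == 0 && c.high == 20);
}
```

The switch still grows by one case per option, but each case stays short,
which is what brings the complexity score of the big function down.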
## Rock-solid curl webinar
Thursday I did a Rock-solid curl webinar, and there is now a video recording
from that made available [5].
## 1.14 x War And Peace
I checked. This week, the curl product source code tree contained 1.14 times
as many words as the massive Leo Tolstoy classic War And Peace.
## vulnerability reports
While I was away the curl security vulnerability report frequency has kept up,
and we see about 2-3 new ones per week now. A general trend recently is
that the share of reports that accurately describe a confirmed
vulnerability has decreased and is now somewhere around one in twenty.
The dismissed reports generally fall into three categories: the crazy ones,
the LLM lies and the ones that we after careful consideration deem to be a
"normal bug" instead of a security issue. Fortunately, the LLM craziness is
still the smallest of these by volume. The crazy ones tend to be things like
reporting that you can find the git repository (or other already public info)
on the web site, reports on code used only in the test suite, gross
misunderstandings of how C code works etc.
We will announce another low severity CVE in curl next week, in association
with the curl 8.11.1 release.
## store TLS sessions
We (well, Stefan to be specific) recently added support for TLS early data
(when using the GnuTLS backend). It is a way to send data to the server
earlier in the handshake, reducing latency and improving transfer speed for
clients. Going forward we plan to roll out and ship this feature for other TLS
backends and for QUIC connections as well.
To be able to use this feature, curl needs to have a cached TLS session ticket
from a previous connection with the server.
For users of curl on the command line, this feature then becomes a bit
restricted: we currently only do TLS session caching in memory, and since the
tool exits often the cached sessions are lost, which makes the use of TLS
early data quite limited.
The obvious remedy for this shortcoming is to make sure that we can save TLS
sessions to disk, so that repeated invocations can use previous sessions and
with that make faster handshakes and get speedier operations!
Stefan has drafted a proposal and thoughts [6] around what data to save and
how. We of course value wide participation in that discussion so that we get
this right and as good as possible.
## criticality score
This week I browsed OpenSSF's list of projects and the "criticality score" [7]
they assign to them, and two facts from that data set stood out to me:
1. They list 561,454 Open Source projects in there now. Quite an astounding
number of separate projects.
2. curl is ranked as project 100 when all of the half a million projects are
sorted on their criticality score. The top ranked project in this list is,
perhaps not too surprisingly, the Linux kernel.
## configure
I happened to spot OpenSSF's newly posted suggestion on how to avoid future xz
attacks, which pretty much boils down to: "don't bundle configure scripts in
release tarballs".
I could not stop myself from objecting to this [8], as it is such a naive
standpoint and also counters the main purpose and USP of using autotools. I
will certainly continue to ship configure in projects.
## cookies
On the httpbis mailing list [9] the discussion has continued around HTTP
cookies, their specification and that interop is hard.
I don't think anyone disagrees with that, but there also is no easy fix and no
easy way out from the fact that cookies were shaky from the start, that we
are still seeing the long tail of countless implementations and that changing
this state is, if not impossible, very hard and time-consuming.
The current cookie spec is RFC 6265 [10] that was published in 2011. The
update in the works for this spec is still only called 6265bis [11] and is
about to become an RFC any day now, but that one will not really fix any of
the problems (or should I say challenges?) that exist with cookies.
There is even already a draft for another cookie spec revision [12] to get
done after 6265bis. So far, though, I could not spot anything in this draft
that actually addresses the issues mentioned in April King's blog post [13]
that kickstarted this recent discussion.
## Coming up
- Wednesday: curl 8.11.1 release + CVE-2024-11053 publication
- Wednesday: live-streamed release video at 09:00 UTC
- Thursday: advanced libcurl webinar
- I hope to get told if my proposed FOSDEM 2025 talk is accepted or not
## Links
[1] = https://daniel.haxx.se/blog/2024/12/03/no-need-to-email-me-about-cisco-anyconnect/
[2] = https://daniel.haxx.se/email/2018-09-17.html
[3] = https://curl.se/dashboard1.html#sscanf
[4] = https://github.com/curl/curl/pull/15692
[5] = https://github.com/curl/curl/pull/15687
[6] = https://github.com/curl/curl/discussions/15684
[7] = https://github.com/ossf/criticality_score
[8] = https://github.com/ossf/wg-best-practices-os-developers/pull/560
[9] = https://lists.w3.org/Archives/Public/ietf-http-wg/2024OctDec/thread.html
[10] = https://datatracker.ietf.org/doc/html/rfc6265
[11] = https://datatracker.ietf.org/doc/html/draft-ietf-httpbis-rfc6265bis-15
[12] = https://johannhof.github.io/draft-annevk-johannhof-httpbis-cookies/draft-annevk-johannhof-httpbis-cookies.html
[13] = https://grayduck.mn/2024/11/21/handling-cookies-is-a-minefield/
--
/ daniel.haxx.se