July 29, 2020

Ole Aamot GNOME Development Blog

Record Live Audio as Ogg Vorbis in GNOME Gingerblue 0.2.0

Today I released GNOME Gingerblue version 0.2.0 with the basic new features:

I began work on GNOME Gingerblue on July 4th, 2018, two years ago, and I plan to spend the next four years completing it for GNOME 4.

GNOME Gingerblue will be a Free Software program for musicians who want to compose, record and share original music on the Internet from the GNOME Desktop.

The project isn’t yet ready for distribution with GNOME 3: the GUI and features such as meta tagging and Internet uploads have yet to be implemented.

The GNOME release team complained about the early release cycle in July and called the project empty, but I estimate it will take at least 4 years to complete 4.0.0, which should still be in reasonable time for GNOME 4 to be released between 2020 and 2026.

The Internet community can’t have Free Music without Free Recording Software for GNOME, but GNOME 4 isn’t built in 1 day.

I am trying to get gtk_record_button_new() into GTK+ 4.0.

I hope to work more on the first major release of GNOME Gingerblue during Christmas 2020 and perhaps get meta tags working as a new feature in 1.0.0.

Meanwhile you can visit the GNOME Gingerblue project domain www.gingerblue.org with the GNOME wiki page, test the initial GNOME Gingerblue 0.2.0 release that writes and records Song files from the microphone in $HOME/Music/ with Wizard GUI and XML parsing from August 2018, or spend money on physical goods such as the Norsk Kombucha GingerBlue soda or the Ngs Ginger Blue 15.6″ laptop bag.

by oleaamot at July 29, 2020 06:00 PM

Peter Hansteen (That Grumpy BSD Guy)

Badness, Enumerated by Robots

A condensed summary of the blacklist data generated from traffic hitting bsdly.net and cooperating sites.

After my runbsd.info entry (previously bsdjobs.com) was posted, there has been an uptick in interest about the security related data generated at the bsdly.net site. I have written quite extensively about these issues earlier so I'll keep this piece short. If you want to go deeper, the field note-like articles I reference and links therein will offer some further insights.

There are three separate sets of downloadable data, all automatically generated and with only very occasional manual intervention.

Known spam sources during the last 24 hours

This is the list directly referenced in the BSDjobs.com piece.

This is a greytrapping based list, where the conditions for inclusion are simple: Attempts at delivery to known-bad addresses (download link here) in domains we handle mail for have happened within the last 24 hours.
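The inclusion rule can be sketched in a few lines of Python. This is only an illustration of the rule as stated above, not how spamd(8) implements it; the function name trapped_hosts, the tuple layout of the delivery attempts and the sample IP addresses (taken from the documentation ranges) are all assumptions made for the example.

```python
from datetime import datetime, timedelta


def trapped_hosts(attempts, spamtraps, now, lifetime=timedelta(hours=24)):
    """Return the sorted IPs that tried delivery to a spamtrap within `lifetime` of `now`.

    `attempts` is an iterable of (ip, recipient, timestamp) tuples (hypothetical layout).
    """
    cutoff = now - lifetime
    return sorted({
        ip
        for ip, recipient, when in attempts
        if recipient.lower() in spamtraps and cutoff <= when <= now
    })


if __name__ == "__main__":
    now = datetime(2020, 7, 29, 12, 0)
    # Made-up delivery attempts; only the spamtrap address comes from this blog.
    attempts = [
        ("192.0.2.1", "adnan@bsdly.net", now - timedelta(hours=2)),
        ("192.0.2.2", "adnan@bsdly.net", now - timedelta(hours=30)),
        ("192.0.2.3", "real.user@example.com", now - timedelta(hours=1)),
    ]
    print(trapped_hosts(attempts, {"adnan@bsdly.net"}, now))
```

Passing a longer lifetime, for example timedelta(days=28), gives the slower-expiring behaviour used for some of the other lists.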

In addition there will occasionally be some addresses added by cron jobs I run that pick the IP addresses of hosts that sent mail that made it through greylisting performed by our spamd(8) but did not pass the subsequent spamassassin or clamav treatment. The bsdly.net system is part of the bgp-spamd cooperation.

The traplist has a home page and at one point was furnished with a set of guidelines.

A partial history (the log starts 2017-05-20) of when spamtraps were added and from which sources can be found in this log (or at this alternate location). Read on for a bit of information on the alternate sources.

Misc other bots: SSH Password bruteforcing, malicious web activity, POP3 Password Bruteforcing.

The bruteforcers list is really a combination of several things, delivered as one file, but with minimal scripting ability you should be able to dig out the distinct elements, described in this piece.

The (usually) largest chunk is a list of hosts that hit the rate limit for SSH connections described in the article, or that were caught trying to log on as a non-existent user or engaging in other undesirable activity aimed at my sshd(8) service. Some as yet unpublished scriptery helps me feed the miscreants that the automatic processes do not catch into the table after a manual quality check.

The second part is a list of IP addresses that tried to access our web service in undesirable ways, including trying for specific URLs or files that will never be found at any world-facing part of our site.

After years of advocating short lifetimes (typically 24 hours) for blacklist entries only to see my logs fill up with attempts made at slightly slower speeds, I set the lifetime for entries in this data set to 28 days. The background including some war stories of monitoring SSH password groping can be found in this piece, while the more recent piece here covers some of the weeding out bad web activity.

The POP3 gropers list comes in two variations. Again, these are lists of IP addresses caught trying to access a service. Most of those accesses are to non-existent user names, with an almost perfect overlap with the local parts (the part before the @ sign) of the addresses in the spamtraps list.

The big list is a complete corpus of IP addresses that have tried these kinds of accesses since I started recording and trapping them (see this piece for some early experience and this one for the start of the big collection).

There is also a smaller set, produced from the longterm table described in this piece. For much the same reason I did not stick to 24-hour expiry for the SSH list, this one has six-week expiry. With some minimal scriptery I run by hand one or two times per day, any invalid POP3 accesses to valid accounts get their IP addresses added to the longterm table and the exported list.

If you're wondering about the title, the term "enumerating badness" stems from Marcus Ranum's classic piece The Six Dumbest Ideas in Computer Security. Please do read that one.

Here are a few references, in addition to those mentioned in the paragraphs above, that you might find useful:

The Book of PF, 3rd edition
Hey, spammer! Here's a list for you! which contains the announcement of the bsdly.net traplist.
Effective Spam and Malware Countermeasures, a more complete treatment of those keywords

If you're interested in further information on any of this, the most useful contact information is in the comment blocks in the exported lists.

Update 2020-07-29: I added a direct link to the complete list of spamtraps, since the web page seemed a bit crowded to at least one visitor. Direct link again here for your convenience.

by Peter N. M. Hansteen (noreply@blogger.com) at July 29, 2020 10:18 AM

July 26, 2020

Ole Aamot GNOME Development Blog

GNOME Internet Radio Locator 3.0.2 for Fedora Core 32

GNOME Internet Radio Locator 3.0.1 (Washington)

GNOME Internet Radio Locator 3.0.2 features updated language translations and a new, improved map marker palette, and now also includes radio from Washington, United States of America (WAMU/NPR); London, United Kingdom (BBC World Service); Berlin, Germany (Radio Eins); Norway (NRK); and Paris, France (France Inter/Info/Culture), as well as 118 other radio stations from around the world, with audio streaming implemented through GStreamer. The project lives on www.gnomeradio.org, and Fedora 32 RPM packages for version 3.0.2 of GNOME Internet Radio Locator are now also available:




To install GNOME Internet Radio Locator 3.0.2 on Fedora 32, run this in Terminal:

sudo dnf install http://www.gnomeradio.org/~ole/fedora/RPMS/x86_64/gnome-internet-radio-locator-3.0.2-1.fc32.x86_64.rpm

by oleaamot at July 26, 2020 12:00 PM

July 04, 2020

Petter Reinholdtsen

Working on updated Norwegian Bokmål edition of Debian Administrator's Handbook

Three years ago, the first Norwegian Bokmål edition of "The Debian Administrator's Handbook" was published. This was based on Debian Jessie. Now a new and updated version based on Buster is getting ready. Work on the updated Norwegian Bokmål edition has been going on for a few months now, and yesterday we reached the first milestone, with 100% of the texts being translated. A lot of proofreading remains, of course, but a major step towards a new edition has been taken.

The book is translated by volunteers, and we would love to get some help with the proofreading. The translation uses the hosted Weblate service, and we welcome everyone to have a look and submit improvements and suggestions. There is also a proofreader's PDF available on request; get in touch if you want to help out that way.

As usual, if you use Bitcoin and want to show your support of my activities, please send Bitcoin donations to my address 15oWEoG9dUPovwmUL9KWAnYRtNJEkP1u1b.

July 04, 2020 09:55 PM

June 06, 2020

Petter Reinholdtsen

Secure Socket API - a simple and powerful approach for TLS support in software

As a member of the Norwegian Unix User Group, I have the pleasure of receiving the USENIX magazine ;login: several times a year. I rarely have time to read all the articles, but try to at least skim through them all as there is a lot of nice knowledge passed on there. I even carry the latest issue with me most of the time to try to get through all the articles when I have a few spare minutes.

The other day I came across a nice article titled "The Secure Socket API: TLS as an Operating System Service" with a marvellous idea I hope can make it all the way into the POSIX standard. The idea is as simple as it is powerful. By introducing a new socket() option IPPROTO_TLS to use TLS, and a system wide service to handle setting up TLS connections, one both makes it trivial to add TLS support to any program currently using the POSIX socket API, and gains system wide control over certificates, TLS versions and encryption systems used. Instead of doing this:

int sock = socket(PF_INET, SOCK_STREAM, IPPROTO_TCP);

the program code would be doing this:

int sock = socket(PF_INET, SOCK_STREAM, IPPROTO_TLS);

According to the ;login: article, converting a C program to use TLS would normally modify only 5-10 lines in the code, which is amazing when compared to using for example the OpenSSL API.

The project has set up the https://securesocketapi.org/ web site to spread the idea, and the code for a kernel module and the associated system daemon is available from two GitHub repositories: ssa and ssa-daemon. Unfortunately there is no explicit license information with the code, so its copyright status is unclear. A request to resolve this has been open since 2018-08-17.

I love the idea of extending socket() to gain TLS support, and understand why it is an advantage to implement this as a kernel module and system wide service daemon, but I cannot help thinking that it would be a lot easier to get projects to move to this way of setting up TLS if it were done with a user space approach, where programs wanting to use this API could just link with a wrapper library.
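To give a feel for what such a user space wrapper could look like, here is a minimal sketch in Python using only the standard socket and ssl modules. It does not use the Secure Socket API or IPPROTO_TLS at all; the function names make_context and tls_connect are made up for this illustration, and the "system wide" policy is simply whatever Python's default SSL context provides.

```python
import socket
import ssl


def make_context() -> ssl.SSLContext:
    """Build a client context with the platform's default CA store and hostname checks."""
    return ssl.create_default_context()


def tls_connect(host: str, port: int = 443, timeout: float = 10.0) -> ssl.SSLSocket:
    """Open a TCP connection and wrap it in TLS, so callers need only one call."""
    raw = socket.create_connection((host, port), timeout=timeout)
    try:
        # server_hostname enables SNI and certificate hostname verification.
        return make_context().wrap_socket(raw, server_hostname=host)
    except Exception:
        raw.close()
        raise
```

A program would then call tls_connect("www.example.org") where it previously created a plain TCP socket. The certificate and protocol policy lives in one function rather than in every caller, which is the property the kernel module approach aims for system wide.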

I recommend you check out this simple and powerful approach to more secure network connections. :)

As usual, if you use Bitcoin and want to show your support of my activities, please send Bitcoin donations to my address 15oWEoG9dUPovwmUL9KWAnYRtNJEkP1u1b.

June 06, 2020 10:40 AM

May 19, 2020

NUUG news

NUUG is building a book scanner - work is under way

There are millions of books whose copyright term has expired. Some of them are Norwegian books, and quite a few of those are not available digitally. To try to do something about the latter, NUUG has decided to have a book scanner built. The design is based on a simple plastic variant (build instructions), but it will be made of aluminium for a longer service life.

The job of building the scanner has been given to our friends at Oslo Sveisemek, who are well under way with the work. Here is a sketch of the construction:


The base frame has been assembled, but a good deal of work still remains:

Assembly of the base frame

The idea is that members and others will be able to borrow or rent the book scanner as needed, and those of us who are interested can get started on digitising books with OCR and determination. Get in touch with aktive (at) nuug.no if this is something for you, or drop by #nuug.

(Photographer: Jonny Birkelund)

May 19, 2020 06:00 PM

May 10, 2020

Peter Hansteen (That Grumpy BSD Guy)

The 'sextortion' Scams: The Numbers Show That What We Have Is A Failure Of Education

Subject: Your account was under attack! Change your credentials!
From: Melissa <chenbin@jw-hw.com>
To: adnan@bsdly.net


I am a hacker who has access to your operating system.

I also have full access to your account.

I've been watching you for a few months now.

The fact is that you were infected with malware through an adult site that you visited.

Did you receive a message phrased more or less like that, which then went on to say that they have a video of you performing an embarrassing activity while visiting an "adult" site, which they will send to all your contacts unless you buy Bitcoin and send it to a specific ID?

The good news is that the video does not exist. I know this, because neither does our friend Adnan here. Despite that fact, whoever operates the account presenting as Melissa appears to believe that Adnan is indeed a person who can be blackmailed. You're probably safe for now. I will provide more detail later in the article, but first a few dos and don'ts:

The important point is that you are or were about to be the victim of what I consider a very obvious scam, and for no good or even nearly valid reason. You should not need to become the next victim.

And this, dear policy makers and tech heads in general, is our problem: A large subset of the general public simply do not know their way around the digital world we created for them to live in. We need to do better.

In that context I find it quite disturbing that people who should know better, such as the Norwegian Center for Information Security, in a recently issued report (also see Digi.no's article (both in Norwegian only, sorry)) predict that the sextortion attacks will become "more sophisticated and credible". Then again at some level they may technically be right, since this kind of activity starts out with a net negative credibility score.

A case in point: Some versions of the scam messages I have been able to study went as far as to claim that the perpetrators had not only taken control of the target's device, they had even sent that very email message from there. That never happened, of course, and it would have been easy for anybody who had learned to interpret Received: headers to verify that the message was in fact sent from the great elsewhere. Unfortunately the skill of reading email headers is rarely, if ever, taught to ordinary users.
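For the technically inclined, pulling the Received: headers out of a message takes only a few lines with Python's standard email module. The message below is entirely made up for illustration (example domains and documentation IP addresses), and received_chain is a hypothetical helper name. Each relay prepends its own Received: header, so the list reads from the most recent hop at the top down to the claimed origin at the bottom.

```python
from email import message_from_string

# Hypothetical message: all hosts and addresses are illustrative only.
SAMPLE = """\
Received: from mail.example.net (mail.example.net [203.0.113.25])
\tby mx.example.org with ESMTP; Sun, 10 May 2020 10:00:00 +0200
Received: from unknown (HELO victim-pc) (198.51.100.77)
\tby mail.example.net with SMTP; Sun, 10 May 2020 09:59:58 +0200
From: you@example.org
To: you@example.org
Subject: Your account was under attack!

I sent this message from your own device.
"""


def received_chain(raw_message):
    """Return the Received: headers, newest hop first, with folded lines joined."""
    msg = message_from_string(raw_message)
    return [" ".join(value.split()) for value in msg.get_all("Received", [])]


if __name__ == "__main__":
    for hop in received_chain(SAMPLE):
        print(hop)
```

In this made-up example the bottom hop shows the message entering the mail system from 198.51.100.77, an outside address, even though the From: line claims it came from the recipient.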

The fact that people do not understand those -- to techies -- obvious facts is a fairly central and burdening problem, and again we need to do better.

Now let me explain. Things get incrementally more technical from here, so if you came here only for the admonitions or practical advice and have no use for the background, feel free to wander off.

I know the message I quoted at the beginning here is a scam because I run my own mail service. Looking at just the logs there, I see that since the last log rotation early Saturday morning, more than 3000 attempts at delivery of messages like the one for Adnan happened, aimed at approximately 200 non-existent recipients, before my logs tell me they finally tried to deliver one to my primary contact address. None of them actually landed in any inbox.

One of the techniques we use to weed out unwanted incoming mail is to maintain and publish a list of known bad and invalid email addresses in our domains. These known bad addresses have then in ways unknown (at least not known to us in any detail) made it into the list of addresses sold to spammers, and we at the receiving end can use the bad addresses as triggers to block traffic from the sending hosts (If you are interested, you can read elsewhere on this blog for details on how we do this, look for tags such as greylisting, greytrapping or antispam).

If it was not clear earlier, those numbers tell us something about the messages at hand. It should be fairly obvious that compromising videos of non-existent users could not, in fact, exist.

Looking back in archived logs from the same system I see that a variant of this message started appearing in late January 2018. The specifics of that message sequence will be interesting to revisit when the full history of sextortion (I still do not like the term, but my preferred alternative is at risk of being filtered out by polite society-serving robots) will be written, but let us rather turn to the more recent data, as in data recorded earlier this week.

Mainly because I found the media coverage of the "sextortion" phenomenon generally uninformed and somewhat annoying, I had been mulling writing an article about it for a while, but I was still looking for a productive angle when on Wednesday evening I noticed a slight swelling in the number of greytrapped hosts. A glance at my spamd log seemed to indicate that at least one of the delivery attempts had a line like

       I am a hacker who has access to your operating system.

Which was actually just what I had been pondering writing about.  

So I set about doing a little research. I grepped (searched) in my yet-unrotated spamd logs for the word hacker, which yielded lots of lines of the type

Feb 22 04:04:35 skapet spamd[8716]: Body: I am a hacker who has access to your operating system.
Feb 22 04:17:04 skapet spamd[8716]: Body: I am a hacker who has access to your operating system.
Feb 22 04:34:03 skapet spamd[8716]: Body: I am a hacker who has access to your operating system.
Feb 22 04:40:30 skapet spamd[8716]: Body: I am a hacker who has access to your operating system.
Feb 22 04:55:04 skapet spamd[8716]: Body: I am a hacker who has access to your operating system.
Feb 22 05:09:39 skapet spamd[8716]: Body: I am a hacker who has access to your operating system.
Feb 22 05:13:22 skapet spamd[8716]: Body: I am a hacker who has access to your operating system.
Feb 22 05:38:02 skapet spamd[8716]: Body: I am a hacker who has access to your operating system.
Feb 22 05:44:39 skapet spamd[8716]: Body: I am a hacker who has access to your operating system.
Feb 22 06:00:30 skapet spamd[8716]: Body: I am a hacker who has access to your operating system.

(the full result has been preserved here). Extracting the source addresses gave a list of 198 IP addresses (preserved here).

Extracting the To: addresses from the fuller listing yielded 192 unique email addresses (preserved here). Looking at the extracted target email addresses yielded some interesting insights:

1) The target email addresses were not exclusively in the domains my system actually serves, and

2) Some ways down the list of target email addresses, my own primary address turns up.

Of course 2) made me look a little closer, and only one IP address in the extract had tried delivery to my email address.

A further grep on that IP address turned up this result.

There are really no surprises to be had here, at least to a large subset of my supposed readers. The sender had first tried to deliver one of the sextortion video messages to one of the by now more than a quarter million spamtraps, and its IP address was still blacklisted by the time it finally tried delivery to a potentially deliverable address.

Doing a few spot checks on the sender IP addresses in recent and less recent logs, it looks like only two things could be mildly exciting about those messages. One is the degree to which the content was intended to be embarrassing to the recipient. The other is a possible indicator of the campaign's success: Looking back through the logs for the approximate year of known activity, it even looks like the campaign became multilingual, while retaining the word "hacker" in most if (possibly) not all language versions.

Other than that it is almost depressing how normal the sextortion campaign is: It uses the same spam sending infrastructure and the same low quality target address lists (the ones containing some subset of my spamtrap addresses) as the regular and likely not too successful spammers of every stripe. Nothing else stands out.

And as returning readers will notice, the logs indicate that the spambots are naive enough in their SMTP code that they frequently mistake spamd's delaying tactics for a slow, but functional open SMTP relay.

Now to recap the main points:
Whatever evolves next out of these rather hamfisted attempts at blackmail is unlikely to ever achieve any level of sophistication worthy of the name.

We would all be much better served by focusing on real threats such as, but not limited to, credential harvesting via deceptive content delivered over advertising networks, which themselves are a major headache security- and privacy-wise, or even harvesting via phishing email.

Both of the latter have been known to lead to successful compromise with data exfiltration and identity theft as possible-to-probable results.

To a large extent the damage could have been significantly limited had the general public been taught sensible security practices such as using multi-factor authentication or at least actually good passwords combined with securely coded password management applications, and insisting that services encourage such practices.

Yes, I know you have been dying to ask: What is the thing about Adnan? According to my activity log, the address adnan@bsdly.net was added as a spamtrap on July 8th, 2017 after some bot had tried to log on as the user adnan, a user name not seen before at bsdly.net,

Jul  8 09:40:34 skapet sshd[34794]: Failed password for invalid user adnan from port 41091 ssh2

apparently from a network in South Korea.

As always, there is more log material available to competent practitioners and researchers with a valid research agenda. Please contact me if you are such a person who could use the collected data productively.

Update 2020-02-29: For completeness and because I felt that an unsophisticated attack like the present one deserves a thorough if unsophisticated analysis, I decided to take a look at the log data for the entire 7 day period, post-rotation.

So here comes some armchair analysis, using only the tools you will find in the base system of your OpenBSD machine or any other running a sensibly stocked unix-like operating system. We start with finding the total number of delivery attempts logged where we have the body text 'am a hacker' (this would show up only after a sender has been blacklisted, so the gross number of actual delivery attempts will likely be a tad higher), with the command

zgrep "am a hacker" /var/log/spamd.0.gz | awk '{print $6}' | wc -l

which tells us the number is 3372.

Next up we use a variation of the same command to extract the source IP addresses of the log entries that contain the string 'am a hacker', sort the result while also removing duplicates and store the end result in an environment variable called lastweek:

 export lastweek=`zgrep "am a hacker" /var/log/spamd.0.gz | awk '{print $6}' | tr -d ':' | sort -u `

With our list of IP addresses tucked away in the environment variable, we go on to the next step: for each IP address in our lastweek set, extract all log entries and store the result (still in crude sort order by IP address) in the file 2020-02-29_i_am_hacker.raw.txt:

for foo in $lastweek ; do zgrep $foo /var/log/spamd.0.gz | tee -a 2020-02-29_i_am_hacker.raw.txt ; done

For reference I kept the list of unique IP addresses (now totalling 231) around too.

Next, we are interested in extracting the target email addresses, so the command

grep "To:" 2020-02-29_i_am_hacker.raw.txt | awk '{print substr($0,index($0,$8))}' | sort -u

finds the lines in our original extract containing "To:", and gives us the list of target addresses the sources in our data set tried to deliver mail to.
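For readers more at home in a scripting language than in shell pipelines, the same steps can be sketched in Python. The sample log lines below are made up for illustration and are not real log data: their layout (source address in the sixth whitespace-separated field, recipient starting at the eighth) is an assumption inferred from the awk commands above, the IP addresses come from the documentation ranges, and the function name analyse is hypothetical.

```python
# Hypothetical sample lines; the field layout is inferred from the awk commands.
SAMPLE_LOG = """\
Feb 22 04:04:35 skapet spamd[8716]: 192.0.2.10: To: adnan@bsdly.net
Feb 22 04:04:35 skapet spamd[8716]: 192.0.2.10: Body: I am a hacker who has access to your operating system.
Feb 22 04:17:04 skapet spamd[8716]: 198.51.100.7: To: nobody@example.com
Feb 22 04:17:04 skapet spamd[8716]: 198.51.100.7: Body: I am a hacker who has access to your operating system.
Feb 22 04:34:03 skapet spamd[8716]: 192.0.2.10: Body: I am a hacker who has access to your operating system.
Feb 22 04:40:30 skapet spamd[8716]: 203.0.113.5: To: someone@example.org
"""


def analyse(log_text, needle="am a hacker"):
    """Mirror the shell steps: count hits, list sources, collect related lines and targets."""
    lines = log_text.splitlines()
    # Step 1: like zgrep ... | wc -l
    hits = [line for line in lines if needle in line]
    # Step 2: sixth field minus its trailing colon, sorted and de-duplicated (sort -u)
    sources = sorted({line.split()[5].rstrip(":") for line in hits})
    # Step 3: every log line mentioning one of those addresses (substring match, like grep)
    related = [line for line in lines if any(ip in line for ip in sources)]
    # Step 4: text from the eighth field onwards on the To: lines
    targets = sorted({" ".join(line.split()[7:]) for line in related if "To:" in line})
    return len(hits), sources, related, targets


if __name__ == "__main__":
    count, sources, related, targets = analyse(SAMPLE_LOG)
    print(count, sources, targets)
```

Like the grep in the shell version, step 3 is a plain substring match, so an address such as 192.0.2.10 would also pull in lines for 192.0.2.100. That is fine for a quick look but worth tightening for anything more serious.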

The result is preserved as 2020-02-29_i_am_hacker.raw_targets.txt, a total of 236 addresses, mostly but not all in domains we actually host here. One surprise was that among the target addresses one actually invalid address turned up that was not at that time yet a spamtrap. See the end of the activity log for details (it also turned out to be the last SMTP entry in that log for 2020-02-29).

This little round of armchair analysis on the static data set confirms the conclusions from the original article: Apart from the possibly titillating aspects of the "adult" web site mentions and the attempt at playing on the target's potential shamefulness over specific actions, as spam campaigns go, this one is ordinary to the point of being a bit boring.

There may well be other actors preying on higher-value targets through their online clumsiness and known peculiarities of taste in an actually targeted fashion, but this is not it.

A final note on tools: In this article, like all previous entries, I have exclusively used the tools you will find in the OpenBSD (or other sensibly put together unixlike operating system) base system or at a stretch as an easily available package.

For the simpler, preliminary investigations and poking around like we have done here, the basic tools in the base system are fine. But if you will be performing log analysis at scale or with any regularity for purposes that influence your career path, I would encourage you to look into setting up a proper, purpose-built log analysis system.

Several good options, open source and otherwise, are available. I will not recommend or endorse any specific one, but when you find one that fits your needs and working style you will find that after the initial setup and learning period it will save you significant time.

As per my practice, only material directly relevant to the article itself has been published via the links. If you are a professional practitioner or researcher who can state a valid reason to need access to unpublished material, please let me know and we will discuss your project.

Update 2020-03-02: I knew I had some early samples of messages that did make it to an inbox near me squirreled away somewhere, and after a bit of rummaging I found them, stored here (note the directory name, it seemed so obvious and transparent even back then). It appears that the oldest intact messages I have are from December 2018. I am sure earlier examples can be found if we look a little harder.

Update 2020-03-17: A fresh example turned up this morning, addressed to (of all things) the postmaster account of one of our associated .no domains, written in Norwegian (and apparently generated with Microsoft Office software). The preserved message can be downloaded here.

Update 2020-05-10: While rummaging about (aka 'researching') for something else I noticed that spamd logs were showing delivery attempts for messages with the subject "High level of danger. Your account was under attack."  So out of idle curiosity on an early Sunday afternoon, I did the following:

$ export muggles=`grep " High level of danger." /var/log/spamd | awk '{print $6}' | tr -d ':' | sort -u`
$ for foo in $muggles; do grep $foo /var/log/spamd >>20200510-muggles ; done

and the result is preserved for your entertainment and/or enlightenment here. Not much to see, really, other than that they sent the message in two language varieties, and to a small subset of our imaginary friends.

by Peter N. M. Hansteen (noreply@blogger.com) at May 10, 2020 10:55 AM

January 07, 2020

NUUG news

Noark service interface (tjenestegrensesnitt) seminar, Monday 27 January 2020, 08:30-11:00

On Monday 27 January 2020, 08:30-11:00, OsloMet and NUUG are hosting a breakfast seminar on the Noark 5 service interface (tjenestegrensesnitt). We find that there are quite a few misunderstandings about the service interface, and with this seminar we want to clear them up and put the focus on the importance of standardisation.

Archives must take their place in a data-driven world, and standardisation and metadata are more important now than ever. Would you like to know more about how standardised records management can help you avoid vendor lock-in? Do you want to avoid creating new digital silos? Do you want to reduce archiving costs in the long run? Join us and find out more about what a standardised, future-oriented records management API can do for you.

Attendance is free (breakfast is on the house), but seats are limited. The seminar will be streamed online, and a recording will be published afterwards.

More information and registration can be found on NUUG's event page.

January 07, 2020 12:30 PM

December 15, 2019

NUUG Foundation

Travel grants - 2020

NUUG Foundation is announcing travel grants for 2020. Applications can be submitted at any time.

December 15, 2019 09:46 AM

December 08, 2019

Nicolai Langfeldt

Bluray with menus on Linux - on Ubuntu

For the longest time it was impossible to play BluRay disks on Linux due to the lack of players that could do it.  VLC has been the most capable video player on Linux and some time ago they managed it.

I run Ubuntu at home.  I can easily install VLC but some parts were missing to get it working.
  1. Be root: sudo -i
  2. libaacs decodes Blurays: apt-get install libaacs0
  3. BluRays or at least VLC need Java 8: apt-get install openjdk-8-jre
  4. Ubuntu and VLC do not agree on the right directory name: cd /usr/lib/jvm/
  5. Link the right one: ln -s java-1.8.0-openjdk-amd64 java-8-openjdk
  6. This library implements BD-J menus: apt-get install libbluray-bdj
Now insert a BluRay disk and play it: vlc bluray://

It should start up with menus. Use arrow keys to navigate and Enter to choose.

by nicolai (noreply@blogger.com) at December 08, 2019 09:51 PM

December 02, 2018

NUUG Foundation

Travel grants - 2019

NUUG Foundation is announcing travel grants for 2019. Applications can be submitted at any time.

December 02, 2018 04:10 PM

November 14, 2018

Dag-Erling Smørgrav


Time for my annual “oh shit, I forgot to bump the copyright year again” round-up!

In the F/OSS community, there are two different philosophies when it comes to applying copyright statements to a project. If the code base consists exclusively (or almost exclusively) of code developed for that specific project by the project’s author or co-authors, many projects will have a single file (usually named LICENSE) containing the license, a list of copyright holders, and the copyright dates or ranges. However, if the code base incorporates a significant body of code taken from other projects or contributed by parties outside the project, it is customary to include the copyright statements and either the complete license or a reference to it in each individual file. In my experience, projects that use the BSD, ISC, MIT, or adjacent licenses tend to use the latter model regardless.

The advantage of the second model is that it’s hard to get wrong. You might forget to add a name to a central list, but you’re far less likely to forget to state the name of the author when you add a new file. The disadvantage is that it’s really, really easy to forget to update the copyright year when you commit a change to an existing file that hasn’t been touched in a while.

So, how can we automate this?

One possibility is to have a pre-commit hook that updates it for you (generally a bad idea), or one that rejects the commit if it thinks you forgot (better, but not perfect; what if you’re adding a file from an outside source?), or one that prints a big fat warning if it thinks you forgot (much better, especially with Git since you can commit --amend once you’ve fixed it, before pushing).

But how do you fix the mistake retroactively, without poring over commit logs to figure out what was modified when?

Let’s start by assuming that you have a list of files that were modified in 2017, and that each file only has one copyright statement that needs to be updated to reflect that fact. The following Perl one-liner should do the trick:

perl -p -i -e 'if (/Copyright/) { s/ ([0-9]{4})-20(?:0[0-9]|1[0-6]) / $1-2017 /; s/ (20(?:0[0-9]|1[0-6])) / $1-2017 /; }'

It should be fairly self-explanatory if you know regular expressions. The first substitution handles the case where the existing statement contains a range, in which case we extend it to include 2017, and the second substitution handles the case where the existing statement contains a single year, which we replace with a range starting with the original year and ending with 2017. The complexity stems mostly from having to take care not to replace 2018 (or later) with 2017; our regexes only match years in the range 2000-2016.
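For those who read Python more easily than Perl, the same two substitutions can be sketched with the re module. This is a line-by-line transliteration for illustration, not a replacement for the one-liner; the function name bump_copyright is made up, and like the original it hard-codes 2017 and only matches years surrounded by spaces.

```python
import re

# A range ending in 2000-2016, e.g. " 2012-2015 "
RANGE = re.compile(r" ([0-9]{4})-20(?:0[0-9]|1[0-6]) ")
# A single year in 2000-2016, e.g. " 2016 "
SINGLE = re.compile(r" (20(?:0[0-9]|1[0-6])) ")


def bump_copyright(line):
    """Extend the copyright year or range on one line so that it ends in 2017."""
    if "Copyright" not in line:
        return line
    # First substitution: extend an existing range.
    line = RANGE.sub(r" \g<1>-2017 ", line, count=1)
    # Second substitution: turn a single year into a range.
    line = SINGLE.sub(r" \g<1>-2017 ", line, count=1)
    return line


if __name__ == "__main__":
    for sample in (
        " * Copyright (c) 2012-2015 Some Author",
        " * Copyright (c) 2016 Some Author",
        " * Copyright (c) 2018 Some Author",
    ):
        print(bump_copyright(sample))
```

After the first substitution has produced a range, the second cannot fire on it, because the first year is then followed by a hyphen rather than a space.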

OK, so now we know how to fix the files, but how do we figure out which ones need fixing?

With Git, we could try something like this:

git diff --name-only 'HEAD@{2017-01-01}..HEAD@{2018-01-01}'

This is… imperfect, though. The first problem is that it will list every file that was touched, including files that were added, moved, renamed, or deleted. Files that were added should be assumed to have had a correct copyright statement at the time they were added; files that were only moved or renamed should not be updated, since their contents did not change; and files that were deleted are no longer there to be updated.¹ We should restrict our search to files that were actually modified:

git diff --name-only --diff-filter M 'HEAD@{2017-01-01}..HEAD@{2018-01-01}'

Some of those changes might be too trivial to copyright, though. This is a fairly complex legal matter, but to simplify, if the change was inevitable and there was no room for creative expression — for instance, a function in a third-party library you are using was renamed, so both the reason for the change and the nature of the change are external to the work itself — then it is not protected. So perhaps you should remove --name-only and review the diff, which is when you realize that half those files were only modified to update their copyright statements because you forgot to do so in 2016. Let’s try to exclude them mechanically, rather than manually. Unfortunately, git diff does not have anything that resembles diff -I, so we have to write our own diff command which does that, and ask git to use it:

$ echo 'exec diff -u -ICopyright "$@"' >diff-no-copyright
$ chmod a+rx diff-no-copyright
$ git difftool --diff-filter M --extcmd $PWD/diff-no-copyright 'HEAD@{2017-01-01}..HEAD@{2018-01-01}'

This gives us a diff, though, not a list of files. We can try to extract the names as follows:

$ git difftool --diff-filter M --no-prompt --extcmd $PWD/diff-no-copyright 'HEAD@{2017-01-01}..HEAD@{2018-01-01}' | awk '/^---/ { print $2 }'

Wait… no, that’s just garbage. The thing is, git difftool works by checking out both versions of a file and diffing them, so what we get is a list of the names of the temporary files it created. We have to be a little more creative:

$ echo '/usr/bin/diff -q -ICopyright "$@" >/dev/null || echo "$BASE"' >list-no-copyright
$ chmod a+rx list-no-copyright
$ git difftool --diff-filter M --no-prompt --extcmd $PWD/list-no-copyright 'HEAD@{2017-01-01}..HEAD@{2018-01-01}'

Much better. We can glue this together with our Perl one-liner using xargs, then repeat the process for 2018.
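One thing to keep in mind when gluing the pieces together is that plain xargs splits its input on whitespace, so file names containing spaces get broken up. A possible alternative, sketched here under the assumption that names arrive one per line, is a small loop; the function name for_each_file is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: run a command once per file name read from stdin.
# Names are taken one per line, so embedded spaces survive intact.
# The command itself must not read standard input.
for_each_file() {
    while IFS= read -r file; do
        "$@" "$file"
    done
}

# Demonstration with a harmless command instead of the Perl one-liner:
printf '%s\n' 'src/main.c' 'docs/read me.txt' | for_each_file printf '<%s>\n'
# Prints:
# <src/main.c>
# <docs/read me.txt>
```

The git difftool command above would then be piped into for_each_file perl -p -i -e '...' with the substitutions from earlier in place of the dots.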

Finally, how about Subversion? On the one hand, Subversion is far simpler than Git, so we can get 90% of the way much more easily. On the other hand, Subversion is far less flexible than Git, so we can’t go the last 10% of the way. Here’s the best I could do:

$ echo 'exec diff -u -ICopyright "$@"' >diff-no-copyright
$ chmod a+rx diff-no-copyright
$ svn diff --ignore-properties --diff-cmd $PWD/diff-no-copyright -r'{2017-01-01}:{2018-01-01}' | awk '/^---/ { print $2 }'

This will not work properly if you have files with names that contain whitespace; you’ll have to use sed with a much more complicated regex, which I leave as an exercise.
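One possible take on that exercise is sketched below. It assumes that each "---" header line consists of the path, a tab, and a parenthesised label such as "(revision 41)"; that layout is an assumption about the labels Subversion hands to the diff command, so check it against your own output first. The function name is made up, and a removed line that happens to look like such a header would still be misread.

```shell
#!/bin/sh
# Hypothetical filter: print the path from each "--- path<TAB>(label)" line.
names_from_diff() {
    tab=$(printf '\t')
    # Strip the leading "--- " and the trailing tab plus parenthesised label.
    sed -n 's/^--- \(.*\)'"$tab"'(.*)$/\1/p'
}

# Sample input with a space in the file name:
{
    printf '%s\t%s\n' '--- dir/my file.c' '(revision 41)'
    printf '%s\t%s\n' '+++ dir/my file.c' '(revision 57)'
    printf '%s\n' '@@ -1 +1 @@' '-old' '+new'
} | names_from_diff
# Prints:
# dir/my file.c
```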

¹ I will leave the issue of move-and-modify being incorrectly recorded as delete-and-add to the reader. One possibility is to include added files in the list by using --diff-filter AM, and review them manually before committing.

by Dag-Erling Smørgrav at November 14, 2018 07:11 PM

October 22, 2018

Dag-Erling Smørgrav

DNS over TLS in FreeBSD 12

With the arrival of OpenSSL 1.1.1, an upgraded Unbound, and some changes to the setup and init scripts, FreeBSD 12.0, currently in beta, now supports DNS over TLS out of the box.

DNS over TLS is just what it sounds like: DNS over TCP, but wrapped in a TLS session. It encrypts your requests and the server’s replies, and optionally allows you to verify the identity of the server. The advantages are protection against eavesdropping and manipulation of your DNS traffic; the drawbacks are a slight performance degradation and potential firewall traversal issues, as it runs over a non-standard port (TCP port 853) which may be blocked on some networks. Let’s take a look at how to set it up.

Basic setup

As a simple test case, let’s set up our 12.0-ALPHA10 VM to use Cloudflare’s DNS service:

# uname -r
# cat >/etc/rc.conf.d/local_unbound <<EOF
# service local_unbound start
Performing initial setup.
/var/unbound/forward.conf created
/var/unbound/lan-zones.conf created
/var/unbound/control.conf created
/var/unbound/unbound.conf created
/etc/resolvconf.conf not modified
Original /etc/resolv.conf saved as /var/backups/resolv.conf.20181021.192629
Starting local_unbound.
Waiting for nameserver to start... good
# host www.freebsd.org
www.freebsd.org is an alias for wfe0.nyi.freebsd.org.
wfe0.nyi.freebsd.org has address
wfe0.nyi.freebsd.org has IPv6 address 2610:1c1:1:606c::50:15
wfe0.nyi.freebsd.org mail is handled by 0 .

Note that this is not a configuration you want to run in production—we will come back to this later.


The downside of DNS over TLS is the performance hit of the TCP and TLS session setup and teardown. We demonstrate this by flushing our cache and (rather crudely) measuring a cache miss and a cache hit:

# local-unbound-control reload
# time host www.freebsd.org >x
host www.freebsd.org > x 0.00s user 0.00s system 0% cpu 0.553 total
# time host www.freebsd.org >x
host www.freebsd.org > x 0.00s user 0.00s system 0% cpu 0.005 total

Compare this to querying our router, a puny Soekris net5501 running Unbound 1.8.1 on FreeBSD 11.1-RELEASE:

# time host www.freebsd.org gw >x
host www.freebsd.org gw > x 0.00s user 0.00s system 0% cpu 0.232 total
# time host www.freebsd.org >x
host www.freebsd.org gw > x 0.00s user 0.00s system 0% cpu 0.008 total

or to querying Cloudflare directly over UDP:

# time host www.freebsd.org >x      
host www.freebsd.org > x 0.00s user 0.00s system 0% cpu 0.272 total
# time host www.freebsd.org >x
host www.freebsd.org > x 0.00s user 0.00s system 0% cpu 0.013 total

(Cloudflare uses anycast routing, so it is not so unreasonable to see a cache miss during off-peak hours.)

This clearly shows the advantage of running a local caching resolver—it absorbs the cost of DNSSEC and TLS. And speaking of DNSSEC, we can separate that cost from that of TLS by reconfiguring our server without the latter:

# cat >/etc/rc.conf.d/local_unbound <<EOF
# service local_unbound setup
Performing initial setup.
Original /var/unbound/forward.conf saved as /var/backups/forward.conf.20181021.205328
/var/unbound/lan-zones.conf not modified
/var/unbound/control.conf not modified
Original /var/unbound/unbound.conf saved as /var/backups/unbound.conf.20181021.205328
/etc/resolvconf.conf not modified
/etc/resolv.conf not modified
# service local_unbound start
Starting local_unbound.
Waiting for nameserver to start... good
# time host www.freebsd.org >x
host www.freebsd.org > x 0.00s user 0.00s system 0% cpu 0.080 total
# time host www.freebsd.org >x
host www.freebsd.org > x 0.00s user 0.00s system 0% cpu 0.004 total

So does TLS add nearly half a second to every cache miss? Not quite, fortunately—in our previous tests, our first query was not only a cache miss but also the first query after a restart or a cache flush, resulting in a complete load and validation of the entire path from the name we queried to the root. The difference between a first and second cache miss is quite noticeable:

# time host www.freebsd.org >x 
host www.freebsd.org > x 0.00s user 0.00s system 0% cpu 0.546 total
# time host www.freebsd.org >x
host www.freebsd.org > x 0.00s user 0.00s system 0% cpu 0.004 total
# time host repo.freebsd.org >x
host repo.freebsd.org > x 0.00s user 0.00s system 0% cpu 0.168 total
# time host repo.freebsd.org >x
host repo.freebsd.org > x 0.00s user 0.00s system 0% cpu 0.004 total
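To put the measurements side by side, the differences can be worked out with a line of awk. This is only arithmetic on the figures quoted in the transcripts; the helper name delta is made up, and the second comparison assumes the last transcript was taken with TLS enabled again, which the 0.546 figure suggests but the text does not state.

```shell
#!/bin/sh
# Hypothetical helper: difference between two timings, in seconds.
delta() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.3f\n", a - b }'
}

# First lookup after a flush: TLS and DNSSEC versus DNSSEC only.
delta 0.553 0.080    # prints 0.473
# First miss after a flush versus a later miss for another name.
delta 0.546 0.168    # prints 0.378
```

Most of the apparent half-second is therefore the one-off cost of the first lookup rather than a per-query TLS penalty.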

Revisiting our configuration

Remember when I said that you shouldn’t run the sample configuration in production, and that I’d get back to it later? This is later.

The problem with our first configuration is that while it encrypts our DNS traffic, it does not verify the identity of the server. Our ISP could be routing all traffic to its own servers, logging it, and selling the information to the highest bidder. We need to tell Unbound to validate the server certificate, but there’s a catch: Unbound only knows the IP addresses of its forwarders, not their names. We have to provide it with names that will match the X.509 certificates used by the servers we want to use. Let’s double-check the certificate:

# :| openssl s_client -connect |& openssl x509 -noout -text |& grep DNS
DNS:*.cloudflare-dns.com, IP Address:, IP Address:, DNS:cloudflare-dns.com, IP Address:2606:4700:4700:0:0:0:0:1111, IP Address:2606:4700:4700:0:0:0:0:1001

This matches Cloudflare’s documentation, so let’s update our configuration:

# cat >/etc/rc.conf.d/local_unbound <<EOF
# service local_unbound setup
Performing initial setup.
Original /var/unbound/forward.conf saved as /var/backups/forward.conf.20181021.212519
/var/unbound/lan-zones.conf not modified
/var/unbound/control.conf not modified
/var/unbound/unbound.conf not modified
/etc/resolvconf.conf not modified
/etc/resolv.conf not modified
# service local_unbound restart
Stopping local_unbound.
Starting local_unbound.
Waiting for nameserver to start... good
# host www.freebsd.org
www.freebsd.org is an alias for wfe0.nyi.freebsd.org.
wfe0.nyi.freebsd.org has address
wfe0.nyi.freebsd.org has IPv6 address 2610:1c1:1:606c::50:15
wfe0.nyi.freebsd.org mail is handled by 0 .

How can we confirm that Unbound actually validates the certificate? Well, we can run Unbound in debug mode (/usr/sbin/unbound -dd -vvv) and read the debugging output… or we can confirm that it fails when given a name that does not match the certificate:

# perl -p -i -e 's/cloudflare/cloudfire/g' /etc/rc.conf.d/local_unbound
# service local_unbound setup
Performing initial setup.
Original /var/unbound/forward.conf saved as /var/backups/forward.conf.20181021.215808
/var/unbound/lan-zones.conf not modified
/var/unbound/control.conf not modified
/var/unbound/unbound.conf not modified
/etc/resolvconf.conf not modified
/etc/resolv.conf not modified
# service local_unbound restart
Stopping local_unbound.
Waiting for PIDS: 33977.
Starting local_unbound.
Waiting for nameserver to start... good
# host www.freebsd.org
Host www.freebsd.org not found: 2(SERVFAIL)

But is this really a failure to validate the certificate? Actually, no. When provided with a server name, Unbound will pass it to the server during the TLS handshake, and the server will reject the handshake if that name does not match any of its certificates. To truly verify that Unbound validates the server certificate, we have to confirm that it fails when it cannot do so. For instance, we can remove the root certificate used to sign the DNS server’s certificate from the test system’s trust store. Note that we cannot simply remove the trust store entirely, as Unbound will refuse to start if the trust store is missing or empty.

While we’re talking about trust stores, I should point out that you currently must have ca_root_nss installed for DNS over TLS to work. However, 12.0-RELEASE will ship with a pre-installed copy.


We’ve seen how to set up Unbound—specifically, the local_unbound service in FreeBSD 12.0—to use DNS over TLS instead of plain UDP or TCP, using Cloudflare’s public DNS service as an example. We’ve looked at the performance impact, and at how to ensure (and verify) that Unbound validates the server certificate to prevent man-in-the-middle attacks.

The question that remains is whether it is all worth it. There is undeniably a performance hit, though this may improve with TLS 1.3. More importantly, there are currently very few DNS-over-TLS providers—only one, really, since Quad9 filter their responses—and you have to weigh the advantage of encrypting your DNS traffic against the disadvantage of sending it all to a single organization. I can’t answer that question for you, but I can tell you that the parameters are evolving quickly, and if your answer is negative today, it may not remain so for long. More providers will appear. Performance will improve with TLS 1.3 and QUIC. Within a year or two, running DNS over TLS may very well become the rule rather than the experimental exception.

by Dag-Erling Smørgrav at October 22, 2018 09:36 AM

July 10, 2018

Nicolai Langfeldt

Email is so 1995!

The other day I was made aware that Friprogsenteret is struggling a bit; they explain why in Farvel epost ("Farewell, email").

That it should be easier to follow up inquiries on Twitter/LinkedIn/Facebook seems odd, to put it mildly. That it should make it easier for them to ignore or say no to inquiries they ought to ignore or say no to seems odd as well. I cannot believe it will make the picture of inquiries, and of what has been answered, any easier to keep track of. The status sphere is a social space, not really a space for handling cases and inquiries. I simply assume that those who seriously use Twitter/Facebook/... for this kind of thing make sure to pull the inquiries into their case and inquiry system, so that they can see what they have considered and handled.

Oh well, I am curious to see what they will have to do for this to succeed - for values of "succeed" other than "I don't follow Twitter" >:-)

by nicolai (noreply@blogger.com) at July 10, 2018 05:54 AM

October 23, 2017

Espen Braastad

ZFS NAS using CentOS 7 from tmpfs

Following up on the CentOS 7 root filesystem on tmpfs post, here comes a guide on how to run a ZFS enabled CentOS 7 NAS server (with the operating system) from tmpfs.


Preparing the build environment

The disk image is built on macOS using Packer and VirtualBox. VirtualBox is installed using the appropriate platform package downloaded from its website, and Packer is installed using brew:

$ brew install packer

Building the disk image

Three files are needed in order to build the disk image: a Packer template file, an Anaconda kickstart file and a shell script that is used to configure the disk image after installation. The following files can be used as examples:

Create some directories:

$ mkdir -p ~/work/centos-7-zfs/
$ mkdir ~/work/centos-7-zfs/http/
$ mkdir ~/work/centos-7-zfs/scripts/

Copy the files to these directories:

$ cp template.json ~/work/centos-7-zfs/
$ cp ks.cfg ~/work/centos-7-zfs/http/
$ cp provision.sh ~/work/centos-7-zfs/scripts/

Modify each of the files to fit your environment.

Start the build process using Packer:

$ cd ~/work/centos-7-zfs/
$ packer build template.json

This will download the CentOS 7 ISO file, start an HTTP server to serve the kickstart file and start a virtual machine using VirtualBox:

Packer installer screenshot

The virtual machine will boot into Anaconda and run through the installation process as specified in the kickstart file:

Anaconda installer screenshot

When the installation process is complete, the disk image will be available in the output-virtualbox-iso folder with the vmdk extension.

Packer done screenshot

The disk image is now ready to be put in initramfs.

Putting the disk image in initramfs

This section is quite similar to the previous blog post, CentOS 7 root filesystem on tmpfs, with minor differences. For simplicity, it is executed on a host running CentOS 7.

Create the build directories:

$ mkdir /work
$ mkdir /work/newroot
$ mkdir /work/result

Export the files from the disk image to one of the directories we created earlier:

$ export LIBGUESTFS_BACKEND=direct
$ guestfish --ro -a packer-virtualbox-iso-1508790384-disk001.vmdk -i copy-out / /work/newroot/

Modify /etc/fstab:

$ cat > /work/newroot/etc/fstab << EOF
tmpfs       /         tmpfs    defaults,noatime 0 0
none        /dev      devtmpfs defaults         0 0
devpts      /dev/pts  devpts   gid=5,mode=620   0 0
tmpfs       /dev/shm  tmpfs    defaults         0 0
proc        /proc     proc     defaults         0 0
sysfs       /sys      sysfs    defaults         0 0
EOF

Disable selinux:

echo "SELINUX=disabled" > /work/newroot/etc/selinux/config

Disable clearing the screen on login failure to make it possible to read any error messages:

mkdir /work/newroot/etc/systemd/system/getty@.service.d
cat > /work/newroot/etc/systemd/system/getty@.service.d/noclear.conf << EOF

Now jump to the Initramfs and Result sections of the CentOS 7 root filesystem on tmpfs post and follow those steps to the end, where the result is a vmlinuz file and an initramfs file.

ZFS configuration

The first time the NAS server boots on the disk image, the ZFS storage pool and volumes will have to be configured. Refer to the ZFS documentation for information on how to do this, and use the following commands only as guidelines.

Create the storage pool:

$ sudo zpool create data mirror sda sdb mirror sdc sdd

Create the volumes:

$ sudo zfs create data/documents
$ sudo zfs create data/games
$ sudo zfs create data/movies
$ sudo zfs create data/music
$ sudo zfs create data/pictures
$ sudo zfs create data/upload

Share some volumes using NFS:

$ sudo zfs set sharenfs=on data/documents
$ sudo zfs set sharenfs=on data/games
$ sudo zfs set sharenfs=on data/music
$ sudo zfs set sharenfs=on data/pictures

Print the storage pool status:

$ sudo zpool status
  pool: data
 state: ONLINE
  scan: scrub repaired 0B in 20h22m with 0 errors on Sun Oct  1 21:04:14 2017

	data        ONLINE       0     0     0
	  mirror-0  ONLINE       0     0     0
	    sdd     ONLINE       0     0     0
	    sdc     ONLINE       0     0     0
	  mirror-1  ONLINE       0     0     0
	    sda     ONLINE       0     0     0
	    sdb     ONLINE       0     0     0

errors: No known data errors

October 23, 2017 11:20 PM

February 13, 2017

Mimes brønn

A freedom-of-information well full of knowledge

Mimes brønn is a web service that helps you request access to information from the public administration, in line with the Norwegian Freedom of Information Act (offentleglova) and the Environmental Information Act (miljøinformasjonsloven). The service has a publicly available archive of all replies received to access requests, so that the public sector can avoid answering the same requests over and over. You will find the service at


According to old Norse mythology, the source of knowledge is guarded by Mime and lies beneath one of the roots of the world tree Yggdrasil. Drinking the water of Mimes brønn (Mime's well) gave such valuable knowledge and wisdom that the young god Odin was willing to pawn an eye, and become one-eyed, to be allowed to drink from it.

The website is maintained by the NUUG association and is particularly well suited to politically interested individuals, organisations and journalists. The service is based on the British sister service WhatDoTheyKnow.com, which has already provided access that has resulted in documentaries and countless press stories. According to mySociety a few years ago, about 20% of access requests to central government went via WhatDoTheyKnow. We in NUUG hope that NUUG's service Mimes brønn can be equally useful to the people of Norway.

Over the weekend the service was updated with a lot of new functionality. The new version works better on small screens, and now shows delivery status for requests, so that the sender can more easily check that the recipient's email system has confirmed receipt of the access request. The service was set up by volunteers in the NUUG association and was launched in the summer of 2015. Since then, 121 users have submitted more than 280 requests about everything from wedding rental of the Opera house and negotiations over the use of Norway's top-level DNS domain .bv to the record-keeping of applications for housing benefit, and the site is a small treasure chest of interesting and useful information. NUUG has engaged lawyers who can assist with complaints about lack of access or deficient case handling.

"NUUG's Mimes brønn was invaluable when we succeeded in ensuring that the DNS top-level domain .bv remains in Norwegian hands," says Håkon Wium Lie.

The service documents widely diverging practice in the handling of access requests, both in response time and in the content of the replies. The vast majority are handled quickly and correctly, but in several cases access has been given to documents that the responsible agency later wanted to withdraw, and access has been given where the redaction was done in a way that does not hide the information that was meant to be redacted.

"The Freedom of Information Act is a mainstay of our democracy. It does not care who asks for access, or why. The Mimes brønn project is a materialisation of this principle, where anyone can request access and appeal refusals, and where the documentation is made public. This makes Mimes brønn one of the most exciting transparency projects I have seen in recent times," says Vegard Venli, the man who got the ownership register of the tax administration (Skatteetaten) opened up.

We in the NUUG association hope Mimes brønn can be a useful tool for keeping our democracy in good shape.

by Mimes Brønn at February 13, 2017 02:07 PM

January 06, 2017

Espen Braastad

CentOS 7 root filesystem on tmpfs

Several years ago I wrote a series of posts on how to run EL6 with its root filesystem on tmpfs. This post is a continuation of that series, and explains step by step how to run CentOS 7 with its root filesystem in memory. It should apply to RHEL, Ubuntu, Debian and other Linux distributions as well. The post is a bit terse to focus on the concept, and several of the steps have potential for improvements.

The following is a screen recording from a host running CentOS 7 in tmpfs:


Build environment

A build host is needed to prepare the image to boot from. The build host should run CentOS 7 x86_64, and have the following packages installed:

yum install libvirt libguestfs-tools guestfish

Make sure the libvirt daemon is running:

systemctl start libvirtd

Create some directories that will be used later; feel free to put them somewhere else:

mkdir -p /work/initramfs/bin
mkdir -p /work/newroot
mkdir -p /work/result

Disk image

For simplicity, we’ll fetch our rootfs from a pre-built disk image, but it is possible to build a custom disk image using virt-manager. I expect that most people would like to create their own disk image from scratch, but this is outside the scope of this post.

Use virt-builder to download a pre-built CentOS 7.3 disk image and set the root password:

virt-builder centos-7.3 -o /work/disk.img --root-password password:changeme

Export the files from the disk image to one of the directories we created earlier:

guestfish --ro -a /work/disk.img -i copy-out / /work/newroot/

Clear fstab since it contains mount entries that no longer apply:

echo > /work/newroot/etc/fstab

SELinux will complain about incorrect disk label at boot, so let’s just disable it right away. Production environments should have SELinux enabled.

echo "SELINUX=disabled" > /work/newroot/etc/selinux/config

Disable clearing the screen on login failure to make it possible to read any error messages:

mkdir /work/newroot/etc/systemd/system/getty@.service.d
cat > /work/newroot/etc/systemd/system/getty@.service.d/noclear.conf << EOF

Initramfs

We’ll create our custom initramfs from scratch. The boot procedure will be, simply put:

  1. Fetch kernel and a custom initramfs.
  2. Execute kernel.
  3. Mount the initramfs as the temporary root filesystem (for the kernel).
  4. Execute /init (in the initramfs).
  5. Create a tmpfs mount point.
  6. Extract our CentOS 7 root filesystem to the tmpfs mount point.
  7. Execute switch_root to boot on the CentOS 7 root filesystem.

The initramfs will be based on BusyBox. Download a pre-built binary or compile it from source, and put the binary in the initramfs/bin directory. In this post I’ll just download a pre-built binary:

wget -O /work/initramfs/bin/busybox https://www.busybox.net/downloads/binaries/1.26.1-defconfig-multiarch/busybox-x86_64

Make sure that busybox has the execute bit set:

chmod +x /work/initramfs/bin/busybox

Create the file /work/initramfs/init with the following contents:

#!/bin/busybox sh

# Dump to sh if something fails
error() {
	echo "Jumping into the shell..."
	setsid cttyhack sh
}

# Populate /bin with binaries from busybox
/bin/busybox --install /bin

mkdir -p /proc
mount -t proc proc /proc

mkdir -p /sys
mount -t sysfs sysfs /sys

mkdir -p /sys/dev
mkdir -p /var/run
mkdir -p /dev

mkdir -p /dev/pts
mount -t devpts devpts /dev/pts

# Populate /dev
echo /bin/mdev > /proc/sys/kernel/hotplug
mdev -s

mkdir -p /newroot
mount -t tmpfs -o size=1500m tmpfs /newroot || error

echo "Extracting rootfs... "
xz -d -c -f rootfs.tar.xz | tar -x -f - -C /newroot || error

mount --move /sys /newroot/sys
mount --move /proc /newroot/proc
mount --move /dev /newroot/dev

exec switch_root /newroot /sbin/init || error

Make sure it is executable:

chmod +x /work/initramfs/init

Create the root filesystem archive using tar. The following command also uses xz compression to reduce the final size of the archive (from approximately 1 GB to 270 MB):

cd /work/newroot
tar cJf /work/initramfs/rootfs.tar.xz .

Create initramfs.gz using:

cd /work/initramfs
find . -print0 | cpio --null -ov --format=newc | gzip -9 > /work/result/initramfs.gz

Copy the kernel directly from the root filesystem using:

cp /work/newroot/boot/vmlinuz-*x86_64 /work/result/vmlinuz

Result

The /work/result directory now contains two files with file sizes similar to the following:

ls -lh /work/result/
total 277M
-rw-r--r-- 1 root root 272M Jan  6 23:42 initramfs.gz
-rwxr-xr-x 1 root root 5.2M Jan  6 23:42 vmlinuz

These files can be loaded directly in GRUB from disk, or using iPXE over HTTP using a script similar to:

kernel http://example.com/vmlinuz
initrd http://example.com/initramfs.gz

January 06, 2017 08:34 PM

July 15, 2016

Mimes brønn

Who has drunk from Mimes brønn?

Mimes brønn has now been up for about a year. We therefore thought it would be interesting to put together some brief statistics on how the service has been used.

At the beginning of July 2016, Mimes brønn had 71 registered users who had sent 120 access requests, of which 62 (52%) were successful, 19 (16%) partially successful, 14 (12%) refused, 10 (8%) were answered that the body did not hold the information, and 12 (10%; 6 from 2016, 6 from 2015) were still unanswered. A handful (3) of the requests could not be categorised. We thus see that around two thirds of the requests were successful, fully or partially. That is good!

The time it takes before the body first replies varies a lot, from the same day (some requests sent to Utlendingsnemnda, Statens vegvesen, Økokrim, Mediatilsynet, Datatilsynet, Brønnøysundregistrene), up to 6 months (Ballangen kommune) or longer (Stortinget, Olje- og energidepartementet, Justis- og beredskapsdepartementet, UDI (Utlendingsdirektoratet) and SSB have received access requests that are still unanswered). The average time here was a couple of weeks (excluding the 12 cases where no reply has arrived). It follows from the Freedom of Information Act (offentleglova) § 29, first paragraph, that requests for access to the administration's documents are to be answered "without undue delay", which according to Sivilombudsmannen (the Parliamentary Ombudsman) should in most cases be interpreted as "the same day or at any rate within 1-3 working days". So there is room for improvement here.

The right of appeal (offentleglova § 32) was used in 20 of the access requests. In most cases (15; 75%) the appeal led to the request being successful. The average time to get a reply to the appeal was one month (excluding 2 cases, appeals sent to Statens vegvesen and Ruter AS, where no reply has arrived). It is well worth appealing, and completely free! Sivilombudsmannen has stated that 2-3 weeks is beyond what is an acceptable processing time for appeals.

Most requests had been sent to Utenriksdepartementet (9), closely followed by Fredrikstad kommune and Brønnøysundregistrene. In all, requests were sent to 60 public authorities, of which 27 were sent two or more. There are over 3700 authorities in the Mimes brønn database. Most of them have thus yet to receive an access request via the service.

When we look at what kind of information people have asked for, we see a broad spectrum of interests: everything from the municipality's parking spaces, travel expense claims where the state's rates for accommodation have been exceeded, correspondence about asylum reception centres and negotiations over the top-level domain .bv, to documents about Myanmar.

The authorities do all sorts of things. Some of it they do badly, some of it they do well. The more we find out about how the authorities work, the greater our chance of proposing improvements to what works badly… and applauding what works well. If there is something you want access to, just click on https://www.mimesbronn.no/ and you are on your way 🙂

by Mimes Brønn at July 15, 2016 03:56 PM

June 01, 2016

Kevin Brubeck Unhammer

Machine translation vs. NTNU examiner

Twitter user @IngeborgSteine recently got quite a bit of attention when she tweeted a picture of the Nynorsk version of her economics exam at NTNU:

This was my economics exam in "Nynorsk". #nynorsk #noregsmållag #kvaialledagar https://t.co/RjCKSU2Fyg
Ingeborg Steine (@IngeborgSteine) May 30, 2016

Creative innovations such as *kvisleis, and all the dialect forms and archaisms, would have been unlikely to turn up in a machine-translated version, so I wondered how much better or worse it would have been if the examiner had simply used Apertium instead. Ingeborg Steine was kind enough to post the Bokmål version, so let's give it a try 🙂


No kvisleis, and free of tær and fyr, but it is not perfect either: certain words are missing from the dictionaries and therefore get the wrong inflection, teller is interpreted as a noun, ein anna maskin has the wrong inflection on the first word (a rule was missing there), and at is in one place interpreted as an adverb (which leads to the peculiar fragment det verta at anteke tilvarande). In addition, the web page detects the language as Tatar, so perhaps it was rather heavy Norwegian? 🙂 But these errors are not particularly hard to fix; the development version of Apertium now gives:


There are still a couple of small things that could be fixed, but it is already better than most of the exams I was handed at UiO …

by unhammer at June 01, 2016 09:45 AM

October 18, 2015

Anders Nordby

Fighting spam with SpamAssassin, procmail and greylisting

On my private server we use a number of measures to stop and prevent spam from arriving in the users' inboxes:

- postgrey (greylisting) to delay arrival (hopefully block lists will be up to date in time to stop unwanted mail; also, some senders do not retry)
- SpamAssassin to block mails by scoring different aspects of the emails. Newer versions of it have URIBL (domain based, for links in the emails) in addition to the traditional RBL (IP based) block lists, which works better. I also created my own URIBL block list which you can use, dbl.fupp.net.
- Procmail. For users on my server, I recommend this procmail rule:

  :0
  * ^X-Spam-Status: Yes
  .crapbox/

  It will sort emails that have a score indicating they are spam into the mailbox "crapbox".
- blocking unwanted and dangerous attachments, particularly for Windows users.

by Anders (noreply@blogger.com) at October 18, 2015 01:09 PM

April 23, 2015

Kevin Brubeck Unhammer


I førre innlegg i denne serien gjekk eg kort gjennom ymse metodar for å generera omsetjingskandidatar til tospråklege ordbøker; i dette innlegget skal eg gå litt meir inn på kandidatgenerering ved omsetjing av enkeltdelane av samansette ord. Me har som nemnt allereie ei ordbok mellom bokmål og nordsamisk, som me vil utvida til bokmål–lulesamisk og bokmål–sørsamisk. Og ordboka blei utvikla for å omsetja typisk «departementsspråk», så ho er full av lange, samansette ord. Og på samisk kan me setja saman ord omtrent på same måte som på norsk (i tillegg til ein haug med andre måtar, men det hoppar me glatt over for no). Dette bør me kunna utnytta, sånn at viss me veit kva «klage» er på lulesamisk, og me veit kva «frist» er, så har me iallfall éin fornuftig hypotese for kva «klagefrist» kan vera på lulesamisk 🙂

Compound splitting is great when you are translating dictionaries. Compounds wrongly written as separate words are great when you want a little smile.
«Ananássasuorma» jali «ananássa riŋŋgu»? Ij le buorre diehtet.

So we can use the few translations we already have between Bokmål and Lule/South Sámi to create more translations, by translating parts of words and then putting them back together. We also have a few translations lying around between North Sámi and Lule/South Sámi, so we can use the same method there (and exploit the fact that we have a Bokmål–North Sámi dictionary to close the loop back to Bokmål).

Coverage and precision

Unfortunately (in this context) we also often have several translations of each word; the existing Bokmål–Lule Sámi dictionaries we are looking at (largely based on Anders Kintel's dictionary) say that «klage» can be, among other things, gujdalvis, gujddim, luodjom or kritihkka, while «frist» can be ájggemierre, giehtadaláduvvat, mierreduvvam or ájggemærráj. If we allow every left part to go with every right part, we get 16 possible candidates for this one word! Probably no more than one or two of them are usable (and maybe not even that). On average we get around twice as many candidates as source words with this method. So we should find methods for cutting down on bad candidates.
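The combinatorial blow-up is easy to reproduce. A minimal sketch, assuming that candidates are formed by plain string concatenation (real compounding may change the form of the first part); the function name is hypothetical:

```python
from itertools import product

# Translations quoted in the post for each compound part.
klage = ["gujdalvis", "gujddim", "luodjom", "kritihkka"]
frist = ["ájggemierre", "giehtadaláduvvat", "mierreduvvam", "ájggemærráj"]


def compound_candidates(left_parts, right_parts):
    """Naively glue every left translation to every right translation."""
    return [left + right for left, right in product(left_parts, right_parts)]


candidates = compound_candidates(klage, frist)
print(len(candidates))  # 4 * 4 = 16
```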

The complementary challenge is getting good enough coverage. Sometimes we see that we have no translation of the parts of a word, even though we have translations of words containing the same parts. That sentence probably calls for an example 🙂 We would like a candidate for the word «øyekatarr» in Lule Sámi, i.e. the compound «øye+katarr». We may have a translation for «øye» in our material, but nothing for «katarr». However, the material does say that «blærekatarr» is gådtjåráhkkovuolssje. So to extend coverage, we can additionally split our source material into all pairs of compound parts; if we know that these words can be analysed as «blære+katarr» and gådtjåráhkko+vuolssje, then it would seem that «blære» is gådtjåráhkko and «katarr» is vuolssje (and Giellatekno fortunately has good morphological analysers that split such words in the right place). This gives a good extension of the material: in fact we get candidates for almost twice as many of the words we want candidates for if we extend the source material this way. But it has a big drawback too: we get over twice as many Lule/South Sámi candidates per Bokmål word (on average around four candidates per source word).
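Extending the material with part-to-part pairs could look roughly like the sketch below. It assumes the compound boundary is already marked with "+" (in the post this comes from the morphological analysers; here it is written by hand), and the function name is hypothetical:

```python
def split_entries(entries):
    """Add part-to-part pairs for entries whose two sides both split in two.

    `entries` is a list of (source analysis, target analysis) tuples,
    with "+" marking the compound boundary.
    """
    pairs = set()
    for source, target in entries:
        s_parts = source.split("+")
        t_parts = target.split("+")
        # Always keep the whole-word pair.
        pairs.add(("".join(s_parts), "".join(t_parts)))
        # Only align parts when both sides are two-way splits.
        if len(s_parts) == 2 and len(t_parts) == 2:
            pairs.update(zip(s_parts, t_parts))
    return pairs


pairs = split_entries([("blære+katarr", "gådtjåráhkko+vuolssje")])
print(sorted(pairs))
```

With «katarr»–vuolssje in the material, a candidate for «øye+katarr» becomes possible as soon as «øye» has a translation.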

Filtering and ranking

We want to narrow the possible candidates down to those most likely to be good. The best test is to see whether the candidate occurs in the corpus, preferably in the same aligned parallel sentence (this is usually a good candidate). If not, we can also check whether the candidate and the source word have similar frequencies, or whether the candidate has any frequency at all.

The compound-splitting translation suggested tsavtshvierhtie for «virkemiddel», and there they were in a parallel sentence too:
<s xml:lang="sma" id="2060"/>Daesnie FoU akte vihkeles tsavtshvierhtie .
<s xml:lang="nob" id="2060"/>Her er FoU er et viktig virkemiddel .

So that is probably a good word pair.
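The parallel-sentence test can be sketched as a simple co-occurrence check. This is only an illustration under simplifying assumptions: tokens are compared as exact lower-cased strings, so inflected forms would be missed, and the function name is hypothetical:

```python
def cooccur_in_parallel(source_word, candidate, aligned_pairs):
    """Return True if both words occur in the same aligned sentence pair."""
    for source_sentence, target_sentence in aligned_pairs:
        source_tokens = source_sentence.lower().split()
        target_tokens = target_sentence.lower().split()
        if source_word in source_tokens and candidate in target_tokens:
            return True
    return False


# The aligned sentence pair quoted above (Bokmål, South Sámi).
aligned = [(
    "Her er FoU er et viktig virkemiddel .",
    "Daesnie FoU akte vihkeles tsavtshvierhtie .",
)]

print(cooccur_in_parallel("virkemiddel", "tsavtshvierhtie", aligned))
```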

Unfortunately we have so little text for Lule/South Sámi that we quickly run out of candidates with any frequency at all. For South Sámi, for example, we only have candidates with corpus hits for around 10 % of the words we generate candidates for.

Another test, which works on all words, is to see whether the candidate gets an analysis from our morphological analysers; if not (and if it also has no corpus hits) it is usually wrong. But this removes only around 1/4 of the candidates; with our split dictionary (where we also include pairs of word parts) we still have on average around three candidates per source word.

(One test that I tried but rejected was filtering based on similar word length. It seems logical that long words are translated to long ones and short words to short ones, but there are many good exceptions. In addition, it removes far too few bad candidates to seem worth it.)

Our parallel corpus material is far too small, but when we generate candidates for dictionaries it is not parallel sentences we are trying to predict, but parallel words and dictionary pairs. And then our training material is really our existing dictionaries … So I tried looking at which compound parts were actually used in our earlier translations, which pairs of parts occurred often in earlier translations, and which parts rarely or never did. For example, our split dictionary for Bokmål–Lule Sámi has these pairs:

Here we see that «løyve» (permit) can be either loahpádus or doajmmaloahpe. Should «taxiløyve» then be táksiloahpádus or táksidoajmmaloahpe? Based on this material we should probably go for the first: even though doajmmaloahpe is listed, only loahpádus appears in compound words.

Then we can try generating candidates for all the Bokmål words in our material, both those we actually want candidates for and those we already have translations for. Next, go through the generated candidates for the words we already have translations for, and count the pairs of word parts that generated such words. We may have made the candidates barggo+loahpádus and barggo+doajmmaloahpe for «arbeids+løyve»; when we then go through the existing translations and find that «arbeidsløyve» was in the dictionary with the translation barggoloahpádus, we increase the frequency of the pair «løyve»–loahpádus by one, while «løyve»–doajmmaloahpe stays at zero.
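The counting step might be sketched as follows. The data structures and the function name are assumptions made for illustration, and candidates are again formed by plain concatenation:

```python
from collections import Counter
from itertools import product


def count_part_pairs(part_translations, known_compounds):
    """Count which (source part, target part) pairs rebuild known translations.

    part_translations: source part -> list of candidate target parts
    known_compounds:   (left, right) source split -> known target word
    """
    counts = Counter()
    for (left, right), known_target in known_compounds.items():
        lefts = part_translations.get(left, [])
        rights = part_translations.get(right, [])
        for t_left, t_right in product(lefts, rights):
            if t_left + t_right == known_target:
                counts[(left, t_left)] += 1
                counts[(right, t_right)] += 1
    return counts


parts = {"arbeids": ["barggo"], "løyve": ["loahpádus", "doajmmaloahpe"]}
known = {("arbeids", "løyve"): "barggoloahpádus"}
counts = count_part_pairs(parts, known)
print(counts[("løyve", "loahpádus")], counts[("løyve", "doajmmaloahpe")])
```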

For now I have only filtered out the candidates where the pair for either the first or the second part had zero frequency. According to a bit of manual evaluation by a linguist, it is almost only bad words that get thrown out, so that filter seems to work well. On the other hand, only around 10 % of the candidates are removed if we only throw out those with zero frequency, so the next step is to use the frequencies to get a full ranking.
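A zero-frequency filter of this kind fits in a few lines. In this sketch the candidate representation, the function name and the counts are hypothetical; only the rule itself (drop a candidate when either part pair was never seen) comes from the text:

```python
def drop_zero_frequency(candidates, pair_counts):
    """Keep candidates whose first and second part pairs were both seen.

    candidates:  list of ((source_left, target_left), (source_right, target_right))
    pair_counts: mapping from (source part, target part) to a frequency
    """
    return [
        (left, right)
        for left, right in candidates
        if pair_counts.get(left, 0) > 0 and pair_counts.get(right, 0) > 0
    ]


# Hypothetical counts, for illustration only.
pair_counts = {("taxi", "táksi"): 1, ("løyve", "loahpádus"): 1}
candidates = [
    (("taxi", "táksi"), ("løyve", "loahpádus")),
    (("taxi", "táksi"), ("løyve", "doajmmaloahpe")),
]
kept = drop_zero_frequency(candidates, pair_counts)
print(["".join(t for _, t in cand) for cand in kept])
```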

If all words could be split into exactly two parts, it might be enough to count pairs of parts and single parts to estimate probabilities, i.e. f(s,t)/f(s). But sometimes words can be split in several ways; for example, we can view «sommersiidastyre» as «sommer+siidastyre» or «sommersiida+styre» (I have chosen to stick to two-way splits of words, to avoid too many alternative candidates). If the translation is giessesijddastivrra, with the analyses giesse+sijddastivrra or giessesijdda+stivrra, we have no immediate reason to choose one over the other (well, we have length in this case, but that does not hold in all such examples, and we can have pairs of analyses that are 2–3 or 3–2). Then we cannot say which pair of word parts (s,t) to increment when we see «sommersiidastyre»–giessesijddastivrra in the training material. But if we additionally see «styre»–stivrra somewhere else, we suddenly have grounds for a decision. Methods like Expectation Maximization can combine related frequencies in this way to arrive at good estimates, but I have not got far enough to implement this yet.
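For the simple case where every word splits in exactly one way, the estimate f(s,t)/f(s) can be sketched as below. This is plain relative frequency, not Expectation Maximization; f(s) is taken to be the sum of the pair counts for s, which is an assumption, and the counts are hypothetical:

```python
from collections import Counter


def translation_probabilities(pair_counts):
    """Estimate P(t | s) as f(s, t) / f(s) from pair counts."""
    source_totals = Counter()
    for (s, _t), n in pair_counts.items():
        source_totals[s] += n
    return {
        (s, t): n / source_totals[s]
        for (s, t), n in pair_counts.items()
        if source_totals[s] > 0
    }


# Hypothetical counts, for illustration only.
pair_counts = {("løyve", "loahpádus"): 3, ("løyve", "doajmmaloahpe"): 1}
probs = translation_probabilities(pair_counts)
print(probs[("løyve", "loahpádus")])  # 3 / 4 = 0.75
```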

by unhammer at April 23, 2015 06:11 PM

January 06, 2015


NSA-proof SSH

One of the biggest takeaways from 31C3 and the most recent Snowden-leaked NSA documents is that a lot of SSH stuff is .. broken.

I’m not surprised, but then again I never am when it comes to this paranoia stuff. However, I do run a ton of SSH in production and know a lot of people that do. Are we all fucked? Well, almost, but not really.

Unfortunately most of what Stribika writes about the “Secure Secure Shell” doesn’t work for old production versions of SSH. The cliff notes for us real-world people, who will realistically be running SSH 5.9p1 for years, are hidden in the bettercrypto.org repo.

Edit your /etc/ssh/sshd_config:

Ciphers aes256-ctr,aes192-ctr,aes128-ctr
MACs hmac-sha2-512,hmac-sha2-256,hmac-ripemd160
KexAlgorithms diffie-hellman-group-exchange-sha256

Basically the nice and forward secure aes-*-gcm and chacha20-poly1305 ciphers, the curve25519-sha256 Kex algorithm and Encrypt-Then-MAC message authentication modes are not available to those of us stuck in the early 2000s. That’s right, provably NSA-proof stuff not supported. Upgrading at this point makes sense.

Still, we can harden SSH, so go into /etc/ssh/moduli and delete all the moduli that have 5th column < 2048, and disable ECDSA host keys:

cd /etc/ssh
mkdir -p broken
mv moduli ssh_host_dsa_key* ssh_host_ecdsa_key* ssh_host_key* broken
# keep only moduli whose 5th column is at least 2048, as described above
awk '$5 >= 2048' broken/moduli > moduli
# create broken links to force SSH not to regenerate broken keys
ln -s ssh_host_ecdsa_key ssh_host_ecdsa_key
ln -s ssh_host_dsa_key ssh_host_dsa_key
ln -s ssh_host_key ssh_host_key
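For anyone who would rather inspect the result before overwriting the file, the moduli filter can also be sketched in Python. This is an illustration, not part of the original recipe: the function name and the sample lines are hypothetical, and the threshold follows the prose (drop entries whose 5th column is below 2048):

```python
def keep_large_moduli(lines, min_bits=2048):
    """Keep moduli lines whose 5th whitespace-separated field is >= min_bits."""
    kept = []
    for line in lines:
        fields = line.split()
        # Comment lines and short lines have no numeric 5th field and are dropped.
        if len(fields) >= 5 and fields[4].isdigit() and int(fields[4]) >= min_bits:
            kept.append(line)
    return kept


# Hypothetical sample lines; the last field stands in for a real modulus.
sample = [
    "# comment line",
    "20120821044040 2 6 100 1023 2 ABCDEF",
    "20120821044050 2 6 100 2047 2 ABCDEF",
    "20120821044060 2 6 100 4095 2 ABCDEF",
]
print(keep_large_moduli(sample))
```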

Your clients, which hopefully have more recent versions of SSH, could have the following settings in /etc/ssh/ssh_config or .ssh/config:

Host all-old-servers

    Ciphers aes256-gcm@openssh.com,aes128-gcm@openssh.com,chacha20-poly1305@openssh.com,aes256-ctr,aes192-ctr,aes128-ctr
    MACs hmac-sha2-512-etm@openssh.com,hmac-sha2-256-etm@openssh.com,hmac-ripemd160-etm@openssh.com,umac-128-etm@openssh.com,hmac-sha2-512,hmac-ripemd160
    KexAlgorithms curve25519-sha256@libssh.org,diffie-hellman-group-exchange-sha256

Note: Sadly, the -ctr ciphers do not provide forward security and hmac-ripemd160 isn’t the strongest MAC. But if you disable these, there are plenty of places you won’t be able to connect to. Upgrade your servers to get rid of these poor auth methods!

Handily, I have made a little script to do all this and more, which you can find in my Gone distribution.

There, done.

Updated Jan 6th to highlight the problems of not upgrading SSH.
Updated Jan 22nd to note CTR mode isn’t any worse.
Go learn about COMSEC if you didn’t get trolled by the title.

by kacper at January 06, 2015 04:33 PM

December 08, 2014


sound sound


Recently I’ve been doing some video editing.. less editing than tweaking my system tho.
If you want your jack output to speak with Kdenlive, a most excellent video editing suite,
and output audio in a nice way without choppiness and popping, which I promise you is not nice,
you’ll want to pipe it through pulseaudio because the alsa to jack stuff doesn’t do well with phonon, at least not on this convoluted setup.

Remember, to get that setup to work, ALSA pipes to jack with the pcm.jack { type jack .. thing, and remove the alsa to pulseaudio stupidity at /usr/share/alsa/alsa.conf.d/50-pulseaudio.conf

So, once that’s in place, it won’t play even though Pulse found your Jack because your clients are defaulting out on some ALSA device… this is when you change /etc/pulse/client.conf and set default-sink = jack_out.

by kacper at December 08, 2014 12:18 AM

February 24, 2013

Bjørn Venn

Chromebook; a real cloud computer – but will it work in the clouds?

Want one? It is not on sale in Norway yet, but you can buy it on Amazon. Read here how I bought mine on Amazon (scroll down the page a bit). With Norwegian VAT, delivered to the Rimi store 100 metres from where I live, it came to 1,850 kroner. It is absolutely worth that :)

by Bjorn Venn at February 24, 2013 07:34 PM

February 22, 2013

Bjørn Venn

Who can get me one of these before Easter?

Chromebook pixel

Google's new Chromebook, the Chromebook Pixel. For now only on sale in the USA and the UK via Google Play and BestBuy.

The world is unfair :)

by Bjorn Venn at February 22, 2013 12:44 PM

October 31, 2011

Anders Nordby

Tailing the wtmp log on 64-bit Linux with Perl?

I like things to happen in an event-driven way, and to that end I have made a script that rsyncs content after an FTP upload. I tail the wtmp log with Perl, and start the sync when the user is or has been logged out (short idle timeout). For tailing wtmp on FreeBSD, I found a working example on the net a long time ago:
# WTMP is assumed to be a filehandle already opened on the wtmp log
$typedef = 'A8 A16 A16 L';
$sizeof = length pack($typedef, () );
while ( read(WTMP, $buffer, $sizeof) == $sizeof ) {
    ($line, $user, $host, $time) = unpack($typedef, $buffer);
    # Do whatever you want with these values here
}
So FreeBSD only uses the values line (ut_line), user (ut_name), host (ut_host) and time (ut_time), cf. utmp.h. Linux (x64, who cares about 32-bit?), on the other hand, stores a good deal more in the wtmp log, and after some Googling, trial and error, and peeking at bits/utmp.h I arrived at:
$typedef = "s x2 i A32 A4 A32 A256 s2 l i2 i4 A20";
$sizeof = length pack($typedef, () );
while ( read(WTMP, $buffer, $sizeof) == $sizeof ) {
    # i4 yields the four 32-bit words of ut_addr_v6, hence the array slice
    ($type, $pid, $line, $id, $user, $host, $term, $exit, $session, $sec, $usec, @addr[0..3], $unused) = unpack($typedef, $buffer);
    # Do whatever you want with these values here
}
Which just works, great. Now I see users logging in and out in real time, and can take action based on that.
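For comparison, the same 384-byte Linux record can be unpacked with Python's struct module. This is a sketch that mirrors the Perl template field by field; it assumes a little-endian x86-64 layout with the two padding bytes written out explicitly, and the function name and the sample record are hypothetical:

```python
import struct

# Mirrors "s x2 i A32 A4 A32 A256 s2 l i2 i4 A20" from the Perl template.
UTMP = struct.Struct("<h2xi32s4s32s256s2hi2i4i20s")


def parse_record(buffer):
    """Unpack one Linux x86-64 utmp/wtmp record into a dict."""
    (ut_type, pid, line, ut_id, user, host,
     e_term, e_exit, session, sec, usec,
     a0, a1, a2, a3, _unused) = UTMP.unpack(buffer)

    def text(raw):
        # Fixed-width fields are NUL-padded; keep what precedes the first NUL.
        return raw.split(b"\0", 1)[0].decode("utf-8", "replace")

    return {
        "type": ut_type,
        "pid": pid,
        "line": text(line),
        "id": text(ut_id),
        "user": text(user),
        "host": text(host),
        "time": sec,
        "addr": (a0, a1, a2, a3),
    }


# Hypothetical record, packed here only to exercise the parser.
record = UTMP.pack(7, 1234, b"pts/0", b"ts/0", b"anders", b"example.org",
                   0, 0, 0, 1320089820, 0, 0, 0, 0, 0, b"")
print(UTMP.size, parse_record(record)["user"])
```

Reading a real log would mean reading UTMP.size bytes at a time from the wtmp file, as the Perl loop does.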

by Anders (noreply@blogger.com) at October 31, 2011 07:37 PM

