Last updated: April 23, 2019 05:45 AM

Planet NUUG

April 05, 2019

Ole Aamot GNOME Development Blog

GNOME Internet Radio Locator 2.0.2 for Fedora Core 30

Fedora 30 RPM packages for version 2.0.2 of GNOME Internet Radio Locator are now available:




by oleaamot at April 05, 2019 07:48 PM

March 25, 2019

Petter Reinholdtsen

PlantUML for text based UML diagram modelling - nice free software

As part of my involvement with the Nikita Noark 5 core project, I have been proposing improvements to the API specification created by The National Archives of Norway, and helped migrate the text from a binary format unfriendly to version control (docx) to Markdown in git. Combined with the migration to a public git repository (on github), this has made it possible for anyone to suggest improvements to the text.

The specification is filled with UML diagrams. I believe the original diagrams were modelled using Sparx Systems Enterprise Architect and exported as EMF files for import into docx. This approach makes it very hard to track changes using a version control system. To improve the situation I have been looking for a good text based UML format with associated command line free software tools on Linux and Windows, to allow anyone to send in corrections to the UML diagrams in the specification. The tool must be text based to work with git, and command line driven so it can be run automatically to generate the diagram images. Finally, it must be free software, to allow anyone, even those who can not accept a non-free software license, to contribute.

I did not know much about free software UML modelling tools when I started. I have used dia and inkscape for simple modelling in the past, but neither is available on Windows, as far as I could tell. I came across a nice list of text mode UML tools, and tested a few of the tools listed there. The PlantUML tool seemed most promising. After verifying that the package is available in Debian and finding its Java source under a GPL license on github, I set out to test if it could represent the diagrams we needed, i.e. the ones currently in the Noark 5 Tjenestegrensesnitt specification. I am happy to report that it could represent them, even though it has a few warts here and there.

After a few days of modelling I completed the task this weekend. A temporary link to the complete set of diagrams (original and from PlantUML) is available in the github issue discussing the need for a text based UML format, but please note that I lack a sensible tool to convert EMF files to PNGs, so the "original" rendering is not as good as it was in the published PDF.

Here is an example UML diagram, showing the core classes for keeping metadata about archived documents:

skinparam classAttributeIconSize 0

!include media/uml-class-arkivskaper.iuml
!include media/uml-class-arkiv.iuml
!include media/uml-class-klassifikasjonssystem.iuml
!include media/uml-class-klasse.iuml
!include media/uml-class-arkivdel.iuml
!include media/uml-class-mappe.iuml
!include media/uml-class-merknad.iuml
!include media/uml-class-registrering.iuml
!include media/uml-class-basisregistrering.iuml
!include media/uml-class-dokumentbeskrivelse.iuml
!include media/uml-class-dokumentobjekt.iuml
!include media/uml-class-konvertering.iuml
!include media/uml-datatype-elektronisksignatur.iuml

Arkivstruktur.Arkivskaper "+arkivskaper 1..*" <-o "+arkiv 0..*" Arkivstruktur.Arkiv
Arkivstruktur.Arkiv o--> "+underarkiv 0..*" Arkivstruktur.Arkiv
Arkivstruktur.Arkiv "+arkiv 1" o--> "+arkivdel 0..*" Arkivstruktur.Arkivdel
Arkivstruktur.Klassifikasjonssystem "+klassifikasjonssystem [0..1]" <--o "+arkivdel 1..*" Arkivstruktur.Arkivdel
Arkivstruktur.Klassifikasjonssystem "+klassifikasjonssystem [0..1]" o--> "+klasse 0..*" Arkivstruktur.Klasse
Arkivstruktur.Arkivdel "+arkivdel 0..1" o--> "+mappe 0..*" Arkivstruktur.Mappe
Arkivstruktur.Arkivdel "+arkivdel 0..1" o--> "+registrering 0..*" Arkivstruktur.Registrering
Arkivstruktur.Klasse "+klasse 0..1" o--> "+mappe 0..*" Arkivstruktur.Mappe
Arkivstruktur.Klasse "+klasse 0..1" o--> "+registrering 0..*" Arkivstruktur.Registrering
Arkivstruktur.Mappe --> "+undermappe 0..*" Arkivstruktur.Mappe
Arkivstruktur.Mappe "+mappe 0..1" o--> "+registrering 0..*" Arkivstruktur.Registrering
Arkivstruktur.Merknad "+merknad 0..*" <--* Arkivstruktur.Mappe
Arkivstruktur.Merknad "+merknad 0..*" <--* Arkivstruktur.Dokumentbeskrivelse
Arkivstruktur.Basisregistrering -|> Arkivstruktur.Registrering
Arkivstruktur.Merknad "+merknad 0..*" <--* Arkivstruktur.Basisregistrering
Arkivstruktur.Registrering "+registrering 1..*" o--> "+dokumentbeskrivelse 0..*" Arkivstruktur.Dokumentbeskrivelse
Arkivstruktur.Dokumentbeskrivelse "+dokumentbeskrivelse 1" o-> "+dokumentobjekt 0..*" Arkivstruktur.Dokumentobjekt
Arkivstruktur.Dokumentobjekt *-> "+konvertering 0..*" Arkivstruktur.Konvertering
Arkivstruktur.ElektroniskSignatur -[hidden]-> Arkivstruktur.Dokumentobjekt

The format is quite compact, with little redundant information. The text expresses entities and relations, and there is little layout related fluff. One can reuse content by using include files, allowing for consistent naming across several diagrams. The include files can be standalone PlantUML too. Here is the content of media/uml-class-arkivskaper.iuml:

class Arkivstruktur.Arkivskaper  {
  +arkivskaperID : string
  +arkivskaperNavn : string
  +beskrivelse : string [0..1]
}

This is what the complete diagram for the PlantUML notation above looks like:

A cool feature of PlantUML is that the generated PNG files include the entire original source diagram as text. The source (with include statements expanded) can be extracted using, for example, exiftool. Another cool feature is that parts of the entities can be hidden after inclusion. This makes it possible to use include files with all attributes listed, even for UML diagrams that should not list any attributes.
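To illustrate how such embedded text can be recovered without exiftool, here is a small Python sketch that walks the PNG chunk structure and returns the text chunks. Note that the chunk keyword used in the comments ("plantuml") is an assumption for the example, not something this post guarantees; check the output of your own PlantUML version.

```python
import struct
import zlib

def png_text_chunks(data):
    """Return (keyword, text) pairs from the tEXt and zTXt chunks of a PNG.

    A PNG file is an 8-byte signature followed by chunks of the form
    length (4 bytes) + type (4 bytes) + data + CRC (4 bytes).
    """
    assert data[:8] == b"\x89PNG\r\n\x1a\n", "not a PNG file"
    chunks, pos = [], 8
    while pos + 8 <= len(data):
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if ctype == b"tEXt":
            # Keyword and text are separated by a NUL byte.
            keyword, _, text = body.partition(b"\x00")
            chunks.append((keyword.decode("latin-1"), text.decode("latin-1")))
        elif ctype == b"zTXt":
            # After the keyword and NUL comes one compression method
            # byte (0 = zlib), then the compressed text.
            keyword, _, rest = body.partition(b"\x00")
            chunks.append((keyword.decode("latin-1"),
                           zlib.decompress(rest[1:]).decode("latin-1")))
        pos += 12 + length  # skip length, type, data and CRC
    return chunks
```

With a real PlantUML-generated PNG one would call `png_text_chunks(open("diagram.png", "rb").read())` and look for the diagram source among the returned pairs.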

The diagram also shows some of the warts. Sometimes the layout engine places text labels on top of each other, and sometimes it places the class boxes too close together, leaving no room for the labels on the relationship arrows. The former can be worked around by placing extra newlines in the labels (i.e. "\n"). I did not do so here, in order to demonstrate the issue. I have not found a good way around the latter, so I normally try to reduce the problem by changing from vertical to horizontal links to improve the layout.

All in all, I am quite happy with PlantUML, and very impressed with how quickly its lead developer responds to questions. So far I have received answers to my emailed questions within a few hours. I definitely recommend looking at PlantUML if you need to make UML diagrams. Note that PlantUML can draw a lot more than class relations; check out the documentation for a complete list. :)

As usual, if you use Bitcoin and want to show your support of my activities, please send Bitcoin donations to my address 15oWEoG9dUPovwmUL9KWAnYRtNJEkP1u1b.

March 25, 2019 08:35 AM

March 24, 2019

Petter Reinholdtsen

Release 0.3 of free software archive API system Nikita announced

Yesterday, a new release of the Nikita Noark 5 core project was announced on the project mailing list. This free software solution is an implementation of the Norwegian archive standard Noark 5, used by government offices in Norway. These were the changes in version 0.3 since version 0.2.1 (from NEWS.md):

If a free and open standardized archiving API sounds interesting to you, please contact us on IRC (#nikita on irc.freenode.net) or email (the nikita-noark mailing list).

As usual, if you use Bitcoin and want to show your support of my activities, please send Bitcoin donations to my address 15oWEoG9dUPovwmUL9KWAnYRtNJEkP1u1b.

March 24, 2019 01:30 PM

February 20, 2019

Ole Aamot GNOME Development Blog

GNOME Internet Radio Locator 2.0.0 for Fedora 29

Fedora 29 RPM packages for version 2.0.0 of GNOME Internet Radio Locator are now available:




by oleaamot at February 20, 2019 11:00 PM

January 24, 2019

Peter Hansteen (That Grumpy BSD Guy)

The UK "Porn" Filter Blocks Kids' Access To Tech, Civil Liberties Websites

It fell to the UK Tories to actually implement the Nanny State. Too bad Nanny Tory does not want kids to read up on tech web sites, or civil liberties ones. Read on for a small sample of what the filter blocks, from a blocked-by-default tech writer.

[Updated 2x, scroll down]

Regular readers (at least those of you who also follow me on Twitter) will know that I'm more than a little skeptical of censorship in general. And you may have seen, as evidenced by this tweet, that I found the decision to implement a nationwide, on-by-default-but-possible-to-opt-out-of web filtering scheme in the UK to be a seriously stupid idea.

But then I was never very likely to become a UK resident or anything more than a very temporary customer of any UK ISP during visits to the country, so I did not give the matter another thought until today, when this tweet announced that you could indeed check whether your web site was blocked. The tweet points you to http://urlchecker.o2.co.uk/urlcheck.aspx, which appears to be a checking engine for UK ISP O2, which is among the ISPs to implement the blocking regime.

I used that URL checker to find the blocking status of various sites where I'm either part of the content-generating team, or that I find interesting enough to visit every now and then. The sites appear in the semi-random order that I visited them on December 22, 2013, starting a little after 16:00 CET:

bsdly.net: I checked my own personal web site first, www.bsdly.net. I was a bit surprised to find that it was blocked in the default Parental control regime. Users of the archive.org Internet Wayback Machine may be able to find one page that contained a reference to a picture of "a blonde chick with a cute pussy", but the intrepid searcher will find that the picture in question in fact was of juvenile poultry and felines, respectively. The site is mainly tech content, with some resources such as the hourly updated list of greytrapped spam senders (see eg this blog post for some explanation of that list and its purpose).

nuug.no: Next up I tried the national Norwegian Unix Users' group web site www.nuug.no, with a somewhat odd result - "The URL has not yet been classified. If you would like it to be classified please press Reclassify URL". There was no Reclassify URL option visible in the web interface, but I would assume that in a default to block regime, the site would be blocked anyway. It would be nice to have confirmation of this from actual O2 customers or other people in the UK.

But NUUG hosts a few specific items I care about, such as my NUUG home page with links to slides from my talks and other resources I've produced over the years. Entering http://home.nuug.no and http://home.nuug.no/~peter/pf/ (the path to my PF tutorial material) both produced an "Invalid URL" message. This looks like a bug in the URL checker code, but once again it would be nice to have confirmation from persons who are UK residents and/or O2 customers about the blocking status for those URLs.

usenix.org: Next I tried www.usenix.org, the main site for USENIX, the US-based but actually quite international Unix user group. This, too, turned out to be apparently blocked in the Parental control regime.

ukuug.org and flossuk.org: But if you're a UK resident, your first port of call for finding out about Unix-like systems is likely to be UK Unix User Group instead, so I checked both www.ukuug.org and flossuk.org, and both showed up as blocked in the Parental control regime (ukuug.org, flossuk.org).

So it appears that it's the official line that kids under 12 in the UK should not be taught about free or open source software, according to the default filtering settings.

eff.org: You will have guessed by now that I'm a civil liberties man, so the next site URL I tried was www.eff.org, which was also blocked by the Parental Control regime. So UK kids need protection from learning about civil liberties and privacy online.

amnesty.org.uk: A little closer to home for UK kids, I thought perhaps a thoroughly benign organization such as Amnesty International would somehow be pre-approved. But no go: I tried the UK web site, amnesty.org.uk, and it, too, was blocked by the Parental Control regime. UK kids apparently need to be shielded from the sly propaganda of an organization that has worked, among other things, for releasing political prisoners and against cruel and unusual punishment such as the death penalty everywhere.

slashdot.org: Next up in my quasi-random sequence was the tech news site slashdot.org, which may at times be informal in tone, but is still so popular that I was somewhat surprised to find that it, too, was blocked by the Parental Control regime.

linuxtoday.com: Another popular tech news site is linuxtoday.com, which, as the name says, has a free and open source software slant. Like slashdot, this one was also blocked by the Parental Control regime.

bsdly.blogspot.com: Circling back to my own turf, I decided to check the site where I publish most often, bsdly.blogspot.com. By this time I wasn't terribly surprised to find that my writing, too, has fallen afoul of something or other and is by default blocked by the Parental Control regime.

nostarch.com: Blocking an individual writer most people probably haven't heard about in a default-to-block regime isn't very surprising, but would they not at least pre-approve well known publishers? I tried nostarch.com (home of, among other things, a series of LEGO-themed tech and science books for kids, Manga guides to various sciences, and various BSD and Linux books). No matter: they too were blocked by the Parental Control regime.

blogspot.com: Along the same lines as the nostarch.com case: a default-to-block regime may well have an unknown scribe blocked, but would they block an entire blogging site's domain? So I tried blogspot.com. The result is that the site is apparently registered as having "dynamic content", so even the "default safety" settings may end up blocking it. And of course, it is another one that's blocked by the Parental Control regime.

arstechnica.com: I still couldn't see any clear logic besides a probable default to block, so I tried another popular tech news site, arstechnica.com. I was a bit annoyed, but not too surprised that this too was blocked by the Parental Control regime.

The last four I tried mainly to get confirmation of what I already suspected:

www.openbsd.org: What could possibly be offensive or subversive about the most secure free operating system's website? I don't know, but the site is apparently too risky for minors, blocked by the Parental Control regime as it is.

undeadly.org: The site undeadly.org is possibly marginally better known under the name OpenBSD Journal. It exists to collect and publish news relevant to the OpenBSD operating system, its developers and users. For Nanny only knows what reason, this site was also blocked by the Parental Control regime.

www.freebsd.org: www.freebsd.org is the home site of FreeBSD, another fairly popular free BSD operating system (which among others Apple has found useful as a source of code that works better in a public maintenance regime). I thought perhaps the incrementally larger community size would have put this site on Nanny's horizon, but apparently not: FreeBSD.org remains blocked by the Parental Control regime.

www.geekculture.com: How about a little geek humor, then? www.geekculture.com is home to several web comics, and The Joy of Tech remains a favorite, even with the marked Apple slant. But apparently that, too, is too much for the children of the United Kingdom: Geekculture.com is blocked by the Parental Control regime.

www.linux.com: And finally, the penguins: By now it should not surprise anyone that www.linux.com, a common starting point for anyone looking for information about that operating system, like the others is blocked by the Parental Control regime.

So, summing up: checking a semi-random collection of mainly fairly mainstream and some rather obscure tech URLs shows that, far from focusing on its stated main objective of keeping innocent children away from online porn, the UK Internet filter shuts the UK's children out of a number of valuable IT resources, as well as several important civil liberties resources.

And if this is the true face of Parental Controls, I for one would take using controls like these as a sufficient indicator that the parents in question are in fact not qualified to do their parenting without proper supervision.

If this is an indicator of how the collective of United Kingdom Internet Nannies is to maintain their filtering regime, they are most certainly part of a bigger problem than the one they claim to be working to solve.

If you are a UK resident or other victim of automated censorship, I would like to hear from you; please submit your story in the comments. I originally also offered the address blockage@bsdly.net, but only a small number of useful responses turned up, immediately after the article was published, and on 2019-01-24 I decided it had earned its place among the spamtraps. Please do not try contacting me via that address; use other easily available methods instead.

Update 2013-12-23 13:05 CET: A reader alerted me to the fact that the URL Checker is down, and that URL now leads to a page that claims the operators are "in the process of reviewing and updating" their offerings.

Update 2013-12-24 19:30 CET: O2 contacted me via twitter direct message, pointing me to their FAQ at http://news.o2.co.uk/2013/12/24/parental-control-questions-answered/. As non-responsive responses go, it was fairly useful, if not entirely constructive. The most useful bit of information is possibly that the service as presented is apparently specific to O2 customers, not the frequently cited national, Tory-backed regime.

As the FAQ document clearly demonstrates, the underlying problem is that some of their customers, for whatever reason, have chosen to leave the monitoring and mentoring of their children's reading to an automated service.

The world contains a multitude of dangers, and most of us, in the UK or elsewhere, would agree that it is a parent's duty both to protect their offspring and to educate them in how to avoid danger or handle problems they encounter.

Ignorance has yet to help anyone solve a problem

There are several ways to protect and educate, and I feel that the approach offered by O2's service is the wrong approach in several important ways. First off, by limiting children's access to information, it strongly recommends choosing ignorance instead of education as the main defense against the perceived evils of the world.

If a person advised you to chain your children to the wall and burn their library cards, you as a responsible parent would perhaps be reluctant to accept that advice as valid. But O2 has no qualms about offering a commercial service that does just that, only via digital means.

But the engineer in me also compels me to point out that the "Parental Control" is designed only to attack a specific symptom of a wider problem, and it fails to address that problem. And making matters slightly worse, the proposed solution is to apply a technical solution to a human or perhaps social problem.

The real problem is that some number of parents do not feel up to the task of mentoring and educating their children in safe and sensible use of their gadgets and the information that is accessible through the gadgets. Parents failing, or perceiving that they may be failing, to adequately educate or mentor their children is the real problem here. Fix that problem, and your symptoms go away.

If a significant subset of O2's customers feel they are unable to handle their parenting duties, the problem may very well be that society is failing to adequately support parents' needs during their child-rearing years.

The solution may well be political, and may very well involve matters that are best resolved by making a proper choice at the ballot box after well reasoned debates. In the meantime, O2 is only making matters worse by answering the needs of persons who feel the symptoms of the deeper problem by catering to a perhaps understandable, but in fact utterly counterproductive, drive for ignorance.

Ignorance never helped anybody solve a problem. Children need to be nurtured, educated, mentored and stimulated to explore. Please do not force them into ignorance instead.

by Peter N. M. Hansteen (noreply@blogger.com) at January 24, 2019 04:31 PM

January 16, 2019

NUUG news

A day in the court of appeal about DNS domain seizure

After a long wait, and a one-day postponement, Wednesday 2019-01-09 finally brought the appeal hearing in Borgarting court of appeal, case 18-027486AST-BORG/02.

A short summary for those not familiar with the case: Økokrim (the Norwegian authority investigating economic crime) received a complaint from Rettighetsalliansen claiming that the website popcorn-time.no violated Norwegian law. As a step in the investigation, Økokrim seized the domain on suspicion of contributory infringement of the Copyright Act. EFN and NUUG reacted first and foremost to this change of practice around seizure of domain names, where domains had previously only been seized after a decision by a judge. As far as this author knows, this is the first time a domain has been seized in this way.

After a brief formal opening, Økokrim's police prosecutor, Maria Bache Dahl, held the opening statement. Økokrim started by explaining that they did not consider it relevant, and that the court therefore did not need to consider, whether BitTorrent or Popcorn Time playback was legal or not.

Since the seizure relies on the provision on contributory liability, there must exist a principal offence that is being contributed to. Økokrim has specified several times that they consider the principal offence to be "the original upload of source files of copyrighted works to the internet without the consent of the rights holder".

Økokrim then spent a lot of time reading aloud from the website in question. The website contained several notices that using a Popcorn Time player could violate local laws, and encouraged the reader to check this on their own. There were also references to VPN use, so that one could hide one's identity.

Judging from conversations this author had with those who attended the district court hearing, Økokrim had moderated itself considerably in this round. What was presented was sober and fairly straightforward, with concrete references to the website's content and little else. Unlike the previous round, the parties now seemed to share a more common understanding of the facts.

After Økokrim's opening statement it was time for remarks from the defence, by Ola Tellesbø. The defence described the website as little more than a simple blog with facts and news. The website in fact had no links to illegal content. The defence also made a point of the website apparently being neither particularly popular nor much visited. It had a counter of Facebook likes, and used Google Analytics.

The defence then spent considerable time on several academic works that attempt to shed light on how much rights holders actually lose to piracy, and could show that the loss is not particularly large. This author assumes this relates to the gravity of the offence, and to the requirement that a seizure must be proportionate to the seriousness of the infringement.

The defence also invoked freedom of expression, and pointed out that the website did not encourage any infringement. On the contrary, the website contained warnings against breaking the law. It would be a somewhat strange ruling if the court were to hold that warnings against breaking the law are in reality admissions that infringement takes place, or an encouragement to infringe.

After the defence's remarks, it was time for comments from third-party intervener EFN. They spoke about human rights and digital rights. By digital rights they mean human rights, but in the digital sphere, such as freedom of expression in this case.

After EFN came the remarks from third-party intervener NUUG. NUUG used an illustration to show the connection between the domain popcorn-time.no and the principal offence as defined by Økokrim. It showed a perpetrator performing an initial upload of copyrighted works to the internet without consent, and then an arbitrary person who through the media becomes aware of Popcorn Time's existence and through web searches perhaps finds the popcorn-time.no website. This is then somehow supposed to constitute contribution to the principal offence.

There was also time to examine a couple of witnesses: Morten Christoffersen, representing IMCASREG8, who could tell the court a little about how they work with registration of domain names, and EFN's party representative Tom Fredrik Blenning, who could tell the court why EFN is engaged in the case.

Tom Fredrik referred to his own education, where he, as a chemistry student, was asked in an exam to show the theory for synthesising GHB. He spoke about freedom of expression allowing Mein Kampf to be published, as well as Breivik's manifesto, and could then show the court that EFN had published the book "En ulovlig bok?" ("An illegal book?"), in which they have included all the pages that were on popcorn-time.no. He explained that either the court must acquit in this case, or the book must also constitute contributory copyright infringement. EFN holds that freedom of expression applies online just as it does in the analogue world. This drew a small smile from the presiding judge, who apparently glanced over at Økokrim, but Økokrim seemed lost in thought elsewhere while this went on.

With this, the first day ended. This author was unable to attend the two following days. Several more witnesses were heard, among them from the MPAA, Rettighetsalliansen and others. Among the points that emerged was that the website used Google AdWords for advertising, yet Økokrim apparently never tried (at least they have not mentioned anything of the sort to the court) to follow that trail to find whoever ultimately ended up with the advertising revenue. The court also received a copy of "En ulovlig bok?".

We look forward to hearing the court's decision around 25 January.

January 16, 2019 06:30 PM

December 02, 2018

NUUG Foundation

Travel grants - 2019

NUUG Foundation announces travel grants for 2019. Applications may be submitted at any time.

December 02, 2018 04:10 PM

November 14, 2018

Dag-Erling Smørgrav


Time for my annual “oh shit, I forgot to bump the copyright year again” round-up!

In the F/OSS community, there are two different philosophies when it comes to applying copyright statements to a project. If the code base consists exclusively (or almost exclusively) of code developed for that specific project by the project’s author or co-authors, many projects will have a single file (usually named LICENSE) containing the license, a list of copyright holders, and the copyright dates or ranges. However, if the code base incorporates a significant body of code taken from other projects or contributed by parties outside the project, it is customary to include the copyright statements and either the complete license or a reference to it in each individual file. In my experience, projects that use the BSD, ISC, MIT, and adjacent licenses tend to use the latter model regardless.

The advantage of the second model is that it’s hard to get wrong. You might forget to add a name to a central list, but you’re far less likely to forget to state the name of the author when you add a new file. The disadvantage is that it’s really, really easy to forget to update the copyright year when you commit a change to an existing file that hasn’t been touched in a while.

So, how can we automate this?

One possibility is to have a pre-commit hook that updates it for you (generally a bad idea), or one that rejects the commit if it thinks you forgot (better, but not perfect; what if you’re adding a file from an outside source?), or one that prints a big fat warning if it thinks you forgot (much better, especially with Git since you can commit --amend once you’ve fixed it, before pushing).
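As a sketch of the warning variant, the core check such a hook would run on each modified file can be expressed as a small function. The function name and the assumption that a copyright statement fits on one line are mine, not from the post:

```python
import re

def copyright_is_current(text, year):
    """Return True if some Copyright line in `text` covers `year`,
    either as a single year or within a YYYY-YYYY range."""
    for line in text.splitlines():
        if "Copyright" not in line:
            continue
        # Find every "YYYY" or "YYYY-YYYY" token on the line.
        for start, _, end in re.findall(r"\b(\d{4})(-(\d{4}))?\b", line):
            last = int(end) if end else int(start)
            if int(start) <= year <= last:
                return True
    return False
```

A pre-commit hook would run this over each staged file and print a big fat warning (rather than reject the commit) whenever it returns False.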

But how do you fix the mistake retroactively, without poring over commit logs to figure out what was modified when?

Let’s start by assuming that you have a list of files that were modified in 2017, and that each file only has one copyright statement that needs to be updated to reflect that fact. The following Perl one-liner should do the trick:

perl -p -i -e 'if (/Copyright/) { s/ ([0-9]{4})-20(?:0[0-9]|1[0-6]) / $1-2017 /; s/ (20(?:0[0-9]|1[0-6])) / $1-2017 /; }'

It should be fairly self-explanatory if you know regular expressions. The first substitution handles the case where the existing statement contains a range, in which case we extend it to include 2017, and the second substitution handles the case where the existing statement contains a single year, which we replace with a range starting with the original year and ending with 2017. The complexity stems mostly from having to take care not to replace 2018 (or later) with 2017; our regexes only match years in the range 2000-2016.
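The same two substitutions, under the same assumptions (a year or range in 2000-2016 surrounded by spaces, at most one statement per line, only lines mentioning Copyright), can be sketched in Python; the function name is mine:

```python
import re

def bump_copyright(line, new_year=2017):
    """Extend a 'YYYY-YYYY' range ending in 2000-2016 to end at new_year,
    or turn a single year in 2000-2016 into a 'YYYY-new_year' range."""
    if "Copyright" not in line:
        return line
    # Range case: " 2004-2012 " becomes " 2004-2017 ".
    line = re.sub(r" ([0-9]{4})-20(?:0[0-9]|1[0-6]) ",
                  r" \g<1>-%d " % new_year, line)
    # Single-year case: " 2012 " becomes " 2012-2017 ".
    line = re.sub(r" (20(?:0[0-9]|1[0-6])) ",
                  r" \g<1>-%d " % new_year, line)
    return line
```

As in the Perl version, the restriction of the year regex to 2000-2016 is what prevents an existing 2018-or-later statement from being rewritten to 2017.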

OK, so now we know how to fix the files, but how do we figure out which ones need fixing?

With Git, we could try something like this:

git diff --name-only 'HEAD@{2017-01-01}..HEAD@{2018-01-01}'

This is… imperfect, though. The first problem is that it will list every file that was touched, including files that were added, moved, renamed, or deleted. Files that were added should be assumed to have had a correct copyright statement at the time they were added; files that were only moved or renamed should not be updated, since their contents did not change; and files that were deleted are no longer there to be updated.¹ We should restrict our search to files that were actually modified:

git diff --name-only --diff-filter M 'HEAD@{2017-01-01}..HEAD@{2018-01-01}'

Some of those changes might be too trivial to copyright, though. This is a fairly complex legal matter, but to simplify, if the change was inevitable and there was no room for creative expression — for instance, a function in a third-party library you are using was renamed, so both the reason for the change and the nature of the change are external to the work itself — then it is not protected. So perhaps you should remove --name-only and review the diff, which is when you realize that half those files were only modified to update their copyright statements because you forgot to do so in 2016. Let’s try to exclude them mechanically, rather than manually. Unfortunately, git diff does not have anything that resembles diff -I, so we have to write our own diff command which does that, and ask git to use it:

$ echo 'exec diff -u -ICopyright "$@"' >diff-no-copyright
$ chmod a+rx diff-no-copyright
$ git difftool --diff-filter M --extcmd $PWD/diff-no-copyright 'HEAD@{2017-01-01}..HEAD@{2018-01-01}'

This gives us a diff, though, not a list of files. We can try to extract the names as follows:

$ git difftool --diff-filter M --no-prompt --extcmd $PWD/diff-no-copyright 'HEAD@{2017-01-01}..HEAD@{2018-01-01}' | awk '/^---/ { print $2 }'

Wait… no, that’s just garbage. The thing is, git difftool works by checking out both versions of a file and diffing them, so what we get is a list of the names of the temporary files it created. We have to be a little more creative:

$ echo '/usr/bin/diff -q -ICopyright "$@" >/dev/null || echo "$BASE"' >list-no-copyright
$ chmod a+rx list-no-copyright
$ git difftool --diff-filter M --no-prompt --extcmd $PWD/list-no-copyright 'HEAD@{2017-01-01}..HEAD@{2018-01-01}'

Much better. We can glue this together with our Perl one-liner using xargs, then repeat the process for 2018.

Finally, how about Subversion? On the one hand, Subversion is far simpler than Git, so we can get 90% of the way much more easily. On the other hand, Subversion is far less flexible than Git, so we can’t go the last 10% of the way. Here’s the best I could do:

$ echo 'exec diff -u -ICopyright "$@"' >diff-no-copyright
$ chmod a+rx diff-no-copyright
$ svn diff --ignore-properties --diff-cmd $PWD/diff-no-copyright -r'{2017-01-01}:{2018-01-01}' | awk '/^---/ { print $2 }'

This will not work properly if you have files with names that contain whitespace; you’ll have to use sed with a much more complicated regex, which I leave as an exercise.
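One possible answer to the exercise: the article hints at sed, but awk splitting on the tab character is simpler, assuming svn's usual "--- path<TAB>(revision N)" header format.

```shell
# svn's unified diff headers look like "--- path<TAB>(revision N)",
# so splitting on the tab survives spaces in file names.
# (The header line below is an invented sample.)
printf -- '--- path/with space/file.c\t(revision 42)\n' |
  awk -F '\t' '/^---/ { sub(/^--- /, "", $1); print $1 }'
# prints: path/with space/file.c
```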

¹ I will leave the issue of move-and-modify being incorrectly recorded as delete-and-add to the reader. One possibility is to include added files in the list by using --diff-filter=AM, and review them manually before committing.

by Dag-Erling Smørgrav at November 14, 2018 07:11 PM

November 11, 2018

Peter Hansteen (That Grumpy BSD Guy)

Goodness, Enumerated by Robots. Or, Handling Those Who Do Not Play Well With Greylisting

SMTP email is not going away any time soon. If you run a mail service, when and to whom you present the code signifying a temporary local problem is well worth your attention.

SMTP email is everywhere and is used by everyone.

If you are a returning reader, the odds are higher than in the general population that you run a mail service yourself.

This in turn means that you will be aware of one of the rather annoying oversights in the original, and still current, specifications of the SMTP-based mail system: while it is straightforward to announce which systems are supposed to receive mail for a domain, specifying which hosts are valid email senders was not part of the original specification at all.

Any functioning domain MUST have at least one MX (mail exchanger) record published via the domain name system, and registrars will generally not even let you register a domain unless you have set up somewhere to receive mail for the domain.

But email worked most of the time anyway, and while you would occasionally hear about valid mail not getting delivered, it was a rarer occurrence than you might think.

Then a few years along, the Internet grew out of the pure research arena and became commercial, and spam started happening. Even in the early days of spam it seems that a significant subset of the messages, possibly even the majority, was sent with faked sender addresses in domains not connected to the actual senders.

Over time people have tried a number of approaches to the problems involved in getting rid of unwanted commercial and/or malware carrying email. If you are interested in a deeper dive into the subject, you could jump over to my earlier piece Effective Spam and Malware Countermeasures - Network Noise Reduction Using Free Tools.

Two very different methods of reducing spam traffic were originally formulated at roughly the same time, and each method's adherents are still duking it out over which approach is the better one.

One method consists simply of implementing a strict interpretation of a requirement that was already formulated in the SMTP RFC at the time.

The other is a complicated extension of the SMTP-relevant data that is published via DNS, and full implementation would require reconfiguration of every SMTP email system in the world.

As you might have guessed, the first is what is commonly referred to as greylisting, where we point to the RFC's requirement that on encountering a temporary error, the sender MUST (RFC language does not get stronger than this) retry delivery at a later time and keep trying for a reasonable amount of time.

Spammers generally did not retry as per the RFC specifications, and even early greylisting adopters saw a huge drop in the volume of spam that actually made it to mailboxes.

On the other hand, end users would sometimes wonder why their messages were delayed, and some mail administrators did not take well to seeing the volume of data sitting in the mail spool directories grow measurably, if not usually uncontrollably, while successive retries after waiting were in progress.

In what could almost appear to be a separate, unconnected universe, other network engineers set out to fix the now glaringly obvious omission in the existing RFCs.

A way to announce valid senders was needed, and the specification that was to be known as the Sender Policy Framework (SPF for short) was offered to the world. SPF offered a way to specify which IP addresses valid mail from a domain were supposed to come from, and even included ways to specify how strictly the limitations it presented should be enforced at the receiving end.

The downsides were that all mail handling would need to be upgraded with code that supported the specification, and as it turned out, traditional forwarding such as performed by common mailing list software would not easily be made compatible with SPF.

Flame wars were fought over both methods; you either remember them or should be able to imagine how they played out.

And while the flames grew less frequent and generally less fierce over time, mail volumes grew to the level where operators would have a large number of servers for outgoing mail, and while such a site would honor the requirement to retry delivery, the retries would not be guaranteed to come from the same IP address as the original attempt.

It was becoming clear to greylisting practitioners that interpreting published SPF data as known good senders was the most workable way forward. Several of us had already started maintaining nospamd tables (see eg this slide and this), and using the output of

$ host -ttxt domain.tld

(sometimes many times over because some domains use include statements), we generally made do. I even made a habit of publishing my nospamd file.
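In practice, "making do" with that output might be sketched as follows; the record and domain are invented, the helper function is hypothetical, and the mechanism syntax is from RFC 7208:

```shell
# Pull the ip4:/ip6: mechanisms out of an SPF TXT record, and flag any
# include:s that would need a further lookup by hand (hypothetical helper):
spf_tokens() {
    tr ' ' '\n' | while read tok; do
        case $tok in
            ip4:*|ip6:*) echo "${tok#ip?:}" ;;
            include:*)   echo "# follow: ${tok#include:}" ;;
        esac
    done
}
# Normally the input would come from: host -t txt domain.tld
echo 'v=spf1 ip4:192.0.2.0/24 include:spf.example.net -all' | spf_tokens
```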

As hinted in this slide, smtpctl (part of OpenSMTPD and in your OpenBSD base system) has, since OpenBSD 6.3, been able to retrieve the entire contents of the published SPF information for any domain you feed it.

Looking over my old nospamd file during the last week or so I found enough sedimentary artifacts there, including IP addresses for which there was no explanation and that lacked a reverse lookup, that I turned instead to deciphering which domains had been problematic and wrote a tiny script to generate a fresh nospamd on demand, based on fresh SPF lookups on those domains.

For those wary of clicking links to scripts, it reads like this:

domains=`cat thedomains.txt`
operator="Peter Hansteen <peter@bsdly.net>"
outfile=nospamd                 # note: these three definitions were missing
locals=nospamd.local            # from the excerpt; the names used here are
generatedate=`date`             # assumptions, adjust to taste

echo "##############################################################################################">$outfile;
echo "# This is the `hostname` nospamd generated from domains at $generatedate. ">>$outfile;
echo "# Any questions should be directed to $operator. ">>$outfile;
echo "##############################################################################################">>$outfile;
echo >>$outfile;

for dom in $domains; do
    echo "processing $dom";
    echo "# $dom starts #########">>$outfile;
    echo >>$outfile;
    echo $dom | doas smtpctl spf walk >>$outfile;
    echo "# $dom ends ###########">>$outfile;
    echo >>$outfile;
done

echo "##############################################################################################">>$outfile;
echo "# processing done at `date`.">>$outfile;
echo "##############################################################################################">>$outfile;

echo "adding local additions from $locals";
echo "# local additions below here ----" >>$outfile;
cat $locals >> $outfile;

If you have been in the habit of fetching my nospamd, you have been fetching the output of this script for the last day or so.

What it does is simply read a prepared list of domains, run them through smtpctl spf walk and slap the results in a file which you would then load into the pf configuration on your spamd machine. You can even tack on a few local additions that for whatever reason do not come naturally from the domains list.
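Loading the generated file into pf might then look like this; a sketch following common spamd setups, with the table name, file path and rule details as assumptions rather than anything from the article:

```
# /etc/pf.conf fragment (sketch)
table <nospamd> persist file "/etc/mail/nospamd"
pass in on egress proto tcp from <nospamd> to any port smtp
```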

But I would actually recommend that you do not fetch my generated data, and instead use this script or a close relative of it (it is a truly trivial script, and you can probably create a better version) with your own list of domains to generate a nospamd tailored to your local environment.

The specific list of domains is derived from more than a decade of maintaining my setup and the specific requests for whitelisting I have received from my users or quick fixes to observed problems in that period. It is conceivable that some domains that were problematic in the past no longer are, and unless we actually live in the same area, some of the domains in my list are probably not relevant to your users. There is even the possibility that some of the larger operators publish different SPF information in specific parts of the world, so the answers I get may not even match yours in all cases.

So go ahead, script and generate! This is your chance to help the robots generate some goodness, for the benefit of your users.

In related news, a request from my new colleagues gave me an opportunity to update the sometimes-repeated OpenBSD and you presentation so it now has at least some information on OpenBSD 6.4. You could call the presentation a bunch of links in a thin wrapper of advocacy and you would not be very wrong.

If you have comments or questions on any of the issues raised in this article, please let me know, preferably via the (moderated) comments field, but I have also been known to respond to email and via various social media message services.

Update 2018-11-11: A few days after I had posted this article, an incident happened that showed the importance of keeping track of both goodness and badness for your services. This tweet is my reaction to a few quick glances at the bsdly.net mail server log:

The downside of maintaining a 55+ thousand entry spamtrap list and whitelisting by SPF is seeing one of the whitelisted sites apparently trying to spam every one of your spamtraps (see https://t.co/ulWt1EloRp). Happening now. Wondering is collecting logs and forwarding worth it?
— Peter N. M. Hansteen (@pitrh) November 9, 2018
A little later I'm clearly pondering what to do, including doing another detailed writeup.
Then again it is an indication that the collected noise is now a required part of the spammer lexicon. One might want to point sites at throwing away outgoing messages to any address on https://t.co/3uthWgKWmL (direct link to list https://t.co/mTaBpF5ucU - beware of html tags!).
— Peter N. M. Hansteen (@pitrh) November 9, 2018
Fortunately I had had some interaction with this operator earlier, so I knew roughly how to approach them. I wrote a couple of quick messages to their abuse contacts and made sure to include links to both my spamtrap resources and a fresh log excerpt that indicated clearly that someone or someones in their network was indeed progressing from top to bottom of the spamtraps list.
I ended up contacting their abuse@ with pointers to the logs that showed evidence of several similar campaigns over the last few days (the period I cared to look at) plus pointers to the spamtrap list and articles. About 30m after the second email to abuse@ the activity stopped.
— Peter N. M. Hansteen (@pitrh) November 10, 2018
As the last tweet says, delivery attempts stopped after progressing to somewhere into the Cs. The moral might be that a list of spamtraps like the one I publish could be useful to other sites for filtering their outgoing mail. Any activity involving the known-bad addresses would be a strong indication that somebody made a very unwise purchasing decision involving address lists.
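A sketch of how such outgoing filtering could work (file names and addresses below are invented stand-ins): any overlap between your outgoing recipient list and the published spamtrap list is a red flag.

```shell
# Invented stand-ins for an outgoing recipient log and the trap list:
printf 'alice@example.com\ntrap1@trap.example\n' > /tmp/outgoing.txt
printf 'trap1@trap.example\ntrap2@trap.example\n' > /tmp/traps.txt
# comm -12 prints only the lines common to both sorted inputs:
sort -u /tmp/outgoing.txt > /tmp/outgoing.sorted
sort -u /tmp/traps.txt > /tmp/traps.sorted
comm -12 /tmp/outgoing.sorted /tmp/traps.sorted
# prints: trap1@trap.example
```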

by Peter N. M. Hansteen (noreply@blogger.com) at November 11, 2018 02:56 PM

October 22, 2018

Dag-Erling Smørgrav

DNS over TLS in FreeBSD 12

With the arrival of OpenSSL 1.1.1, an upgraded Unbound, and some changes to the setup and init scripts, FreeBSD 12.0, currently in beta, now supports DNS over TLS out of the box.

DNS over TLS is just what it sounds like: DNS over TCP, but wrapped in a TLS session. It encrypts your requests and the server’s replies, and optionally allows you to verify the identity of the server. The advantages are protection against eavesdropping and manipulation of your DNS traffic; the drawbacks are a slight performance degradation and potential firewall traversal issues, as it runs over a non-standard port (TCP port 853) which may be blocked on some networks. Let’s take a look at how to set it up.

Basic setup

As a simple test case, let’s set up our 12.0-ALPHA10 VM to use Cloudflare’s DNS service:

# uname -r
# cat >/etc/rc.conf.d/local_unbound <<EOF
# service local_unbound start
Performing initial setup.
/var/unbound/forward.conf created
/var/unbound/lan-zones.conf created
/var/unbound/control.conf created
/var/unbound/unbound.conf created
/etc/resolvconf.conf not modified
Original /etc/resolv.conf saved as /var/backups/resolv.conf.20181021.192629
Starting local_unbound.
Waiting for nameserver to start... good
# host www.freebsd.org
www.freebsd.org is an alias for wfe0.nyi.freebsd.org.
wfe0.nyi.freebsd.org has address
wfe0.nyi.freebsd.org has IPv6 address 2610:1c1:1:606c::50:15
wfe0.nyi.freebsd.org mail is handled by 0 .

Note that this is not a configuration you want to run in production—we will come back to this later.


The downside of DNS over TLS is the performance hit of the TCP and TLS session setup and teardown. We demonstrate this by flushing our cache and (rather crudely) measuring a cache miss and a cache hit:

# local-unbound-control reload
# time host www.freebsd.org >x
host www.freebsd.org > x 0.00s user 0.00s system 0% cpu 0.553 total
# time host www.freebsd.org >x
host www.freebsd.org > x 0.00s user 0.00s system 0% cpu 0.005 total

Compare this to querying our router, a puny Soekris net5501 running Unbound 1.8.1 on FreeBSD 11.1-RELEASE:

# time host www.freebsd.org gw >x
host www.freebsd.org gw > x 0.00s user 0.00s system 0% cpu 0.232 total
# time host www.freebsd.org >x
host www.freebsd.org gw > x 0.00s user 0.00s system 0% cpu 0.008 total

or to querying Cloudflare directly over UDP:

# time host www.freebsd.org >x      
host www.freebsd.org > x 0.00s user 0.00s system 0% cpu 0.272 total
# time host www.freebsd.org >x
host www.freebsd.org > x 0.00s user 0.00s system 0% cpu 0.013 total

(Cloudflare uses anycast routing, so it is not so unreasonable to see a cache miss during off-peak hours.)

This clearly shows the advantage of running a local caching resolver—it absorbs the cost of DNSSEC and TLS. And speaking of DNSSEC, we can separate that cost from that of TLS by reconfiguring our server without the latter:

# cat >/etc/rc.conf.d/local_unbound <<EOF
# service local_unbound setup
Performing initial setup.
Original /var/unbound/forward.conf saved as /var/backups/forward.conf.20181021.205328
/var/unbound/lan-zones.conf not modified
/var/unbound/control.conf not modified
Original /var/unbound/unbound.conf saved as /var/backups/unbound.conf.20181021.205328
/etc/resolvconf.conf not modified
/etc/resolv.conf not modified
# service local_unbound start
Starting local_unbound.
Waiting for nameserver to start... good
# time host www.freebsd.org >x
host www.freebsd.org > x 0.00s user 0.00s system 0% cpu 0.080 total
# time host www.freebsd.org >x
host www.freebsd.org > x 0.00s user 0.00s system 0% cpu 0.004 total

So does TLS add nearly half a second to every cache miss? Not quite, fortunately—in our previous tests, our first query was not only a cache miss but also the first query after a restart or a cache flush, resulting in a complete load and validation of the entire path from the name we queried to the root. The difference between a first and second cache miss is quite noticeable:

# time host www.freebsd.org >x 
host www.freebsd.org > x 0.00s user 0.00s system 0% cpu 0.546 total
# time host www.freebsd.org >x
host www.freebsd.org > x 0.00s user 0.00s system 0% cpu 0.004 total
# time host repo.freebsd.org >x
host repo.freebsd.org > x 0.00s user 0.00s system 0% cpu 0.168 total
# time host repo.freebsd.org >x
host repo.freebsd.org > x 0.00s user 0.00s system 0% cpu 0.004 total

Revisiting our configuration

Remember when I said that you shouldn’t run the sample configuration in production, and that I’d get back to it later? This is later.

The problem with our first configuration is that while it encrypts our DNS traffic, it does not verify the identity of the server. Our ISP could be routing all our DNS traffic to its own servers, logging it, and selling the information to the highest bidder. We need to tell Unbound to validate the server certificate, but there’s a catch: Unbound only knows the IP addresses of its forwarders, not their names. We have to provide it with names that will match the x509 certificates used by the servers we want to use. Let’s double-check the certificate:

# :| openssl s_client -connect |& openssl x509 -noout -text |& grep DNS
DNS:*.cloudflare-dns.com, IP Address:, IP Address:, DNS:cloudflare-dns.com, IP Address:2606:4700:4700:0:0:0:0:1111, IP Address:2606:4700:4700:0:0:0:0:1001

This matches Cloudflare’s documentation, so let’s update our configuration:

# cat >/etc/rc.conf.d/local_unbound <<EOF
# service local_unbound setup
Performing initial setup.
Original /var/unbound/forward.conf saved as /var/backups/forward.conf.20181021.212519
/var/unbound/lan-zones.conf not modified
/var/unbound/control.conf not modified
/var/unbound/unbound.conf not modified
/etc/resolvconf.conf not modified
/etc/resolv.conf not modified
# service local_unbound restart
Stopping local_unbound.
Starting local_unbound.
Waiting for nameserver to start... good
# host www.freebsd.org
www.freebsd.org is an alias for wfe0.nyi.freebsd.org.
wfe0.nyi.freebsd.org has address
wfe0.nyi.freebsd.org has IPv6 address 2610:1c1:1:606c::50:15
wfe0.nyi.freebsd.org mail is handled by 0 .

How can we confirm that Unbound actually validates the certificate? Well, we can run Unbound in debug mode (/usr/sbin/unbound -dd -vvv) and read the debugging output… or we can confirm that it fails when given a name that does not match the certificate:

# perl -p -i -e 's/cloudflare/cloudfire/g' /etc/rc.conf.d/local_unbound
# service local_unbound setup
Performing initial setup.
Original /var/unbound/forward.conf saved as /var/backups/forward.conf.20181021.215808
/var/unbound/lan-zones.conf not modified
/var/unbound/control.conf not modified
/var/unbound/unbound.conf not modified
/etc/resolvconf.conf not modified
/etc/resolv.conf not modified
# service local_unbound restart
Stopping local_unbound.
Waiting for PIDS: 33977.
Starting local_unbound.
Waiting for nameserver to start... good
# host www.freebsd.org
Host www.freebsd.org not found: 2(SERVFAIL)

But is this really a failure to validate the certificate? Actually, no. When provided with a server name, Unbound will pass it to the server during the TLS handshake, and the server will reject the handshake if that name does not match any of its certificates. To truly verify that Unbound validates the server certificate, we have to confirm that it fails when it cannot do so. For instance, we can remove the root certificate used to sign the DNS server’s certificate from the test system’s trust store. Note that we cannot simply remove the trust store entirely, as Unbound will refuse to start if the trust store is missing or empty.

While we’re talking about trust stores, I should point out that you currently must have ca_root_nss installed for DNS over TLS to work. However, 12.0-RELEASE will ship with a pre-installed copy.


We’ve seen how to set up Unbound—specifically, the local_unbound service in FreeBSD 12.0—to use DNS over TLS instead of plain UDP or TCP, using Cloudflare’s public DNS service as an example. We’ve looked at the performance impact, and at how to ensure (and verify) that Unbound validates the server certificate to prevent man-in-the-middle attacks.

The question that remains is whether it is all worth it. There is undeniably a performance hit, though this may improve with TLS 1.3. More importantly, there are currently very few DNS-over-TLS providers—only one, really, since Quad9 filter their responses—and you have to weigh the advantage of encrypting your DNS traffic against the disadvantage of sending it all to a single organization. I can’t answer that question for you, but I can tell you that the parameters are evolving quickly, and if your answer is negative today, it may not remain so for long. More providers will appear. Performance will improve with TLS 1.3 and QUIC. Within a year or two, running DNS over TLS may very well become the rule rather than the experimental exception.

by Dag-Erling Smørgrav at October 22, 2018 09:36 AM

July 10, 2018

Nicolai Langfeldt

Email is so 1995!

The other day I was made aware that the friprog centre is struggling a bit; they explain it themselves in Farvel epost ("Goodbye email").

That it should be easier to follow up enquiries on Twitter/LinkedIn/Facebook seems odd, to put it mildly. That it should make it easier for them to ignore, or say no to, the enquiries they ought to ignore or say no to also seems odd. I cannot believe it will make the overview of enquiries, and of what has been answered, any clearer. The status-sphere is a social space, not really a space for handling cases and enquiries. I simply assume that those who seriously use Twitter/Facebook/... for such things make sure to pull the enquiries into their case-handling system, so they can see what they have considered and dealt with.

Oh well, I am curious what they will have to do for this to succeed - for values of "succeed" other than "I don't follow Twitter" >:-)

by Nicolai Langfeldt (noreply@blogger.com) at July 10, 2018 05:54 AM

Getting rid of the DVDs

Some years ago I ripped my CDs and have since stored all my music on disk (quite a few disks and now also my phone). The CDs have been relegated to the basement.

For some time I've been meaning to do the same with the DVDs, so 3 years ago I made sure my new server had a BD capable optical drive. And a lot of disk.  It has since received a lot more disk (just because I could, and with the DVD project in mind), but the disks are not exactly full. Life as a dad and husband with a newly built house does not lend itself to other projects.

But after the kids ruined yet another DVD the other day - by having left it on the floor - I resolved to use some late evening hours to work on the problem.

Going back in time two years now: the first stage was clear to me: makemkv. It decrypts the DVDs and stores all the contents (all video angles, all audio tracks, all subtitles) in an MKV file. Thus I should have good and complete source material and won't have to rip it from disk again later. makemkv is easy to operate, unlike dvd::rip. It's even easier than HandBrake, too - it does less than both - KISS in practice! In addition to decrypting whatever disk I feed it, it works around that silly DVD zone problem, so I can rip all my US disks as well as my European disks, no sweat, without changing the zone of the drive or messing with the drive firmware.

I had hoped that the next step was simple: DLNA playback via the family's "smart" Samsung BD player. Samsung supports all kinds of formats, including CD-XA in Matroska containers. But the first tests were disappointing - the video stuttered. I am not sure if it was the computing power of the player, the lack of bandwidth in the WiFi network, or the USB WiFi dongle sitting in the back of the player. But my laptop can play back the same MKV files over the same WiFi without stuttering. Also, and much worse, it would not let me change audio tracks or subtitles during DLNA playback. Other HD content in MKV containers it plays very well over WiFi - just not the full-bandwidth DVD streams, of which I have a few on disk.

So the problem went on the back burner for quite a while.

As time passed I discovered that "MX Player" on my Android phone could play pretty much anything.

Also as time passed I had been considering an Apple TV running Linux, some other mini box running XBMC, and Popcorn Hour - the A-400 model seems compelling. I've been tempted by TV dongles running Android - one just might hang from one of the HDMI inputs of the amplifier at the back of the shelf. And there are a number of other media boxes based on Android as well. The thing is, I'm reluctant to add boxes to the family's unfortunately low-space media center; an old phone or an Android dongle should be just the ticket. Getting out the old Samsung Galaxy S II proved that the full-bandwidth DVD stream made it stutter too (and I plan to use the S III as a phone a while longer). I then grew dubious about how well some other Android device with a lesser CPU might handle my media streams, and also cynical about how many OS upgrades such devices would receive after purchase.

Seems I will have to transcode the streams to some lower bit rate already proven to be handled by the phone... or indeed the Samsung BD player! OK scratch Android, bring back the BD player.

Next up: how to transcode from the full-content MKV to a lower-bit-rate file with just the selected content, as easy as apple pie (you'll have to wait while I write the code ;-). It seems a given that I will use the HandBrake CLI tool, though. It's a GOOD deal less scary than any other transcoding tool I have seen or attempted to use over the years, and it can use all the CPU cores in the computer.

Honorable mentions:

If I had stayed with the old phone or another Android alternative, I would have had to do the same transcoding to get lower bit rates. And then I would probably have used BSPlayer, which supports playback from Samba shares, among other things.

Also, MediaHouse is a very good DLNA browser that supports IMDB lookups (including movie posters) and can hand remote content to MX Player for playback, even though MX Player itself does not support DLNA. Unfortunately that precludes fast forward, rewind or even resuming the video, which sucks if you watch anything longer than 5 minutes.

by Nicolai Langfeldt (noreply@blogger.com) at July 10, 2018 05:50 AM

April 14, 2018

NUUG Foundation

Perl Toolchain Summit 2018

NUUG Foundation is supporting the Perl Toolchain Summit 2018. The conference for the developers behind Perl will be held in Oslo, April 19-22, 2018.

April 14, 2018 02:35 PM

January 04, 2018

Holder de ord

Her er Venstre, Høyre og Fremskrittspartiet enige

Venstre har denne uken startet regjeringsforhandlinger med Høyre og Fremskrittspartiet.

Pressen har skrevet flere spaltemeter om de største kampsakene og om hvilke porteføljer som er de viktigste for de tre partiene. De er uenige om mangt, både innenfor klimafeltet, innvandringsfeltet og forholdet til EU. Men det er også en del saker partiene er enige om. Holder de ord har derfor gått gjennom partiprogrammene til de tre for perioden 2017-21 for å finne områdene der partiene er enige seg imellom. Partiprogrammene finner du her.

Familie og helse

På et overordnet nivå ønsker alle tre partier at offentlige og private tjenestetilbydere skal likestilles. De ønsker også et tettere samarbeid mellom skole, barnevern, politi og eventuelle andre aktører på oppvekstfeltet. Venstre og FrP lover også begge at det skal være økt satsing på tidlig innsats og forebygging i helsevesenet, og at frivillige og ideelle aktører i større grad skal bidra her. I en eventuell regjeringserklæring for disse tre partiene vil vi kunne forvente at fosterfamilier blir lovet styrkede rettigheter, at ettervern for barnevernsbarn igjen blir satt på agendaen, og at helsestasjonene blir lovet styrket.

Ettervernet ble lovet gjennomgått i Sundvolden-erklæringen fra 2013, men ble ikke fulgt opp i Solbergs første regjeringsperiode. Styrking av helsestasjoner og økt satsing på velferdsteknologi, som alle tre partier er opptatt av i inneværende periode, ble derimot gjennomført i forrige regjeringsperiode.

Utenriks og nordområder

All tre partier lover fortsatt norsk militær tilstedeværelse i nordområdene, noe som ikke er politisk kontroversielt, selv om de fleste av opposisjonspartiene har et annet fokus enn det militære når nordområdene omtales.

Alle tre partier lover også en reduksjon av tollbarrierer, slik at utviklingsland skal ha bedre tilgang til det norske markedet, men det er neppe noe som vil bli prioritert høyt. Solberg1-regjeringen lovte også i forrige periode at det skulle jobbes for bedre ordninger som ville gi utviklingsland tilgang til norske markeder, men det ble ikke igangsatt noen konkrete tiltak for å nå dette målet.

Næringsliv og finanser

Alle tre partier har vært, og er fortsatt, svært negative til formuesskatten, men Venstre har endret standpunkt noe siden forrige periode. Venstre vil nå redusere skatten gjennom en gradvis økning av bunnfradraget, men nevner ikke lenger at skatten skal fjernes i sitt partiprogram for inneværende periode. En regjering bestående av disse tre partiene vil imidlertid måtte søke støtte hos Senterpartiet for å få gjennomført større endringer i formuesskatten. De øvrige partiene vil enten øke skatten for større formuer, eller, der de er opptatt av reduksjon, kun ønsker reduksjon for skatt på arbeidende kapital.

Alle tre partier har programfestet at kontanter skal beholdes som pliktig betalingsmiddel. Ingen partier har vedtatt at de ønsker å fjerne dette, så dette er en ikke-sak i norsk politisk sammenheng.

Mer substansielt er løftene fra alle tre partier om at det med dem ved makta skal bli enklere å starte egen bedrift. FrP og Høyre utdyper ikke i sine partiprogram på hvilken måte det skal bli enklere, mens Venstre peker på stimulering til økt risikovilje for kapitaltilførsel.

Skole og utdanning

Alle tre partier vil bygge flere studentboliger. FrP har imidlertid gått vekk fra tallet 2000, som de hadde vedtatt skulle bygges i forrige periode, mens Venstre har økt ambisjonene fra 2000 boliger til 3000. Både Høyre og Venstre ønsker også å gjøre studentboliger til et eget reguleringsformål og plan- og bygningsloven.

Alle tre partier går inn for å sikre den pågående innfasingen av 11 måneders studiestøtte.

Alle tre er også enige om at de ønsker åpne tilgjengelige data om skoleresultater. Dette finnes på skoleporten.no, så disse løftene må tolkes som en garanti for at dette ikke endres.

Det kan bli noe uenighet om hvordan friskoler skal behandles, men alle tre partier er positive til konseptet.

Den mest kostbare og vanskeligst gjennomførbare enigheten vil være bedre språkopplæring i barnehagene, da det allerede er vanskelig å få nok pedagogisk utdannet personell til barnehagene. En styrking av finansieringen til universiteter og høgskoler, som både Venstre og FrP har gått til valg på, vil også bli en utgiftspost dersom de to partiene får med seg Høyre på dette.

I tillegg er de tre partiene enige om at entreprenørskap på et eller annet vis bør inn i skolen, at skoleelever skal ha mer fysisk aktivitet og at skolepersonell skal ha tilbud om etter- og videreutdanning.

Culture and the voluntary sector

The cultural sector can expect promises of simpler rules and less application bureaucracy, while the voluntary sector can expect promises of an improved VAT compensation scheme. It is, however, unlikely that Venstre will win through with its wish to raise the VAT compensation to 100 per cent. In the previous government period, the blue-blue coalition promised that the scheme would be improved, but settled for topping up the pot. For the current period, Høyre only promises that the scheme will be "good and predictable".

Climate, energy and agriculture

There is little doubt that an impact assessment of oil extraction off Lofoten, Vesterålen and Senja will be one of the major points of contention in the negotiations. There is, however, agreement that more should be invested in hydropower, and Venstre and FrP agree that forests should be used actively as a climate measure, including planting and fertilizing forests. Høyre does not mention this specifically.

Høyre agrees with Venstre that more biofuel should be produced in Norway, that the work of cleaning up contaminated masses on the seabed must continue, and that efforts should be made to prevent the spread of invasive species. FrP does not address any of these topics in its party programme.

All three parties promise to repeal the allodial provisions of the Constitution. This was attempted in the previous period as well, but the government was unable to secure a majority for it.

It may also happen that FrP and Venstre get Høyre to join them in working internationally for a ban on the dumping of fish.

Justice and preparedness

All three parties are positive to the animal police trial started in the previous period, and we can expect a promise to expand this scheme in any government declaration. There is also agreement that prison queues must come down, and it is then natural to expect promises to build more prison places. All three parties also want foreign convicts to serve their sentences in their home countries to a greater extent.

Both Høyre and Venstre want to allow dual citizenship. FrP is against this, but there is a majority for dual citizenship in the Storting, so it may still be implemented in this period.

by Hanna Tranås (hanna@holderdeord.no) at January 04, 2018 11:57 AM

December 19, 2017

NUUG news

A short report from two days in the district court on the DNS domain confiscation

It has been quiet about the DNS seizure case here on the blog, but the case itself has not been quiet. We simply have not had time to blog due to a heavy workload at work and at home, but here, at last, is a small update. We would greatly appreciate you showing your support for our efforts by donating money to the defence fund. So far, probably more than NOK 100,000 in legal fees has accrued, and we have received nowhere near that amount in donations. We will persevere, but it is easier with help.

Today NUUG, EFN and IMC were in the district court after IMC refused to accept the confiscation of the DNS domain popcorn-time.no. This is the next step in the case that started with Økokrim's seizure of the same domain. Since the last blog post, the appeal of the seizure to the court of appeal was not upheld, and the appeal to the Supreme Court was rejected. That put a provisional end to the seizure case. In the meantime, Økokrim has closed its investigation, concluded that it has no one to charge, and decided to confiscate the DNS domain. "Confiscation" is apparently what it is called when the police take things after an investigation is closed, while "seizure" is when they take something before the investigation is closed. So we were in the district court to protest the confiscation, and to have the questions of principle in the case better examined.

One of the expert witnesses was Håkon Wium Lie, who has drawn and written the following summary of the two days in court:

[quick notes from Follo District Court]

Day 1. The presiding judge (Jonn Ola Sørensen) is joined by two lay judges, a woman and a man who both appear to be older than him. Neither of them is likely in the core group of Popcorn Time users, but who knows: the witness Rune Ljostad has told us that Popcorn Time is used by a great many people in Norway. Whether this is due to Dagsrevyen's report:


... or Aftenposten's articles:


or links from Google, or the now closed website:


is an issue in the case.

The prosecutor, Maria Bache Dahl from Økokrim, and Ola Tellesbø, who represents the Norwegian registrar, each gave an opening statement from their side of the case. Then lawyers from NUUG and EFN briefly presented their positions.

Then it was time for witnesses. First the registrar, Morten E. Eriksen, who explained how he easily registers names in the .no domain without knowing who is behind them or what kind of content will be put on the sites. In this way he often hears about the names of films or products to be released the following year.

Lawyer Rune Ljostad appeared as a witness for Økokrim. He is a professional pirate hunter and gave an overview of other similar cases where Norwegian and foreign authorities have shut down websites or taken domain names. When asked whether there was a "Play" button on popcorn-time.no, he would not answer, and Maria Bache Dahl quickly stepped in and said that Rune had not been called to talk about popcorn-time.no, but about what she calls "the Popcorn Time service".

This looks set to become another question: what is Popcorn Time? A service? A protocol? A website? A player? A search engine?

The judge's comments suggest that he is struggling to sort out the concepts. Rune Ljostad has a good technical understanding. So does Petter Reinholdtsen, but he did not get to speak today due to lack of time.

The last witness today is Dani Bacsa, an internet investigator for the Motion Picture Association. He lives in Brussels and flew in for the occasion. He came a long way to say what many other witnesses in the case could also have said. No one disagrees about the technical aspects of the case. Here is a short summary:

The question in the case is whether this qualifies as "contributing" to a crime. If the answer is yes, the court will presumably uphold Økokrim's confiscation of popcorn-time.no. If not, the domain name will presumably be returned to whoever registered it.

Day 2 began with Petter Reinholdtsen explaining why Rune Ljostad's figures from the day before are wrong. Petter argued that the number of films in the public domain is far higher than Rune Ljostad's estimate (which was 1%).

Afterwards, Willy Johansen, who represents those who sell films on physical media, spoke about how sales have declined. Popcorn Time got much of the blame, but he did not know how large an impact popcorn-time.no had had. He further said that it is very easy to upload films from a mobile phone and that these would then automatically spread to all of one's contacts.

Morten Vestergaard Stephanson described how much it costs to produce films and the need for revenue. He also mentioned that much of the film production in Norway is publicly funded.

Tom Fredrik Blenning presented the case from EFN's side: EFN and its partner organizations work against censorship and for freedom of expression. Seizing popcorn-time.no is an infringement of freedom of expression. He also said that the automatic spreading of films to all the contacts on one's phone is nonsense.

Wilhelm Joys Andersen had analysed the source code behind popcorn-time.no and concluded that the site is optimized for search engines -- the vast majority of those who saw it must have come from a search engine and were therefore already interested in the topic. The purpose of creating the site was presumably to be found by users searching for "popcorntime" in a search engine. Ads for VPN services were prominent on the site.

Håkon Wium Lie was the last witness. He explained that the Opera browser has added support for both the BitTorrent protocol and VPN -- the technology has legitimate uses, and most larger Norwegian companies offer VPN to their employees. Much file sharing is illegal under Norwegian law, but the harm is limited: without The Pirate Bay we would not have gotten Spotify and Netflix. The website popcorn-time.no was very peripheral to the main offence, which is the illegal uploading of films. The tolerance for writing about and linking to controversial topics must be high in Norway, and file sharing is not a societal problem on a par with incitement to terrorism.

After lunch, prosecutor Maria Bache Dahl gave her closing argument. She argues, not entirely surprisingly, that popcorn-time.no contributed to copyright infringement and that the website popcorn-time.no must be judged differently from an article in Aftenposten.

She moved that the registrar must accept the confiscation of the domain popcorn-time.no, and argued why.

Ola Tellesbø began by moving that the confiscation of the right to use popcorn-time.no not be accepted, and that Økokrim pay the legal costs. He argued why not.

Kirill continued, saying that "uploading of source files", which the prosecutor considers illegal, is too vague a concept to be counted as illegal. When users upload to YouTube, they do exactly the same thing.

Ola continued by bringing up the DVD-Jon case. He read from the judgment, in which Jon was acquitted of contributory infringement. He argued that the website is peripheral to the main offence, and that those who read the text were perhaps more confused than encouraged. He further argued that "contributory liability" belongs in drug and murder cases, not in copyright cases.

Lawyer Kjetil Wick Sætre, who represents NUUG, then gave a shorter comment. He mentioned, among other things, that NUUG has offered to take over the domain in order to use it in a clearly lawful way. NUUG does not support illegal file sharing, but believes the technology also has legitimate uses that create value. NUUG believes illegal use should be enforced as close to the offence as possible, i.e. at the point of uploading.

Lawyer Henny Hallingskog-Hultin, who represented EFN, said that the core of the case is the balance between freedom of expression and the enforcement of intellectual property rights. EFN considers the expressions on the website too far removed from the main offence -- in this case there are many links in the chain between the uploading of source files (which is the illegal main offence) and the website popcorn-time.no. Confiscating popcorn-time.no has minimal effect on illegal downloading, and the confiscation will have a "chilling effect". The case is an abuse of copyright to censor the internet.

After a short break there was a brief reply from the prosecutor.

Kirill's main point, on behalf of everyone, was that Økokrim is spending disproportionately large resources on the case. This is the 5th (!) round in Follo District Court. The lawyers are, with one exception, working for free because the case is of principal importance.

The judgment will be delivered on 15 January at 10:00. The deadline for appeal is two weeks.

The court adjourned at 16:45, and the presiding judge wished everyone a merry Christmas.

December 19, 2017 08:30 PM

October 23, 2017

Espen Braastad

ZFS NAS using CentOS 7 from tmpfs

Following up on the CentOS 7 root filesystem on tmpfs post, here comes a guide on how to run a ZFS enabled CentOS 7 NAS server with the operating system running from tmpfs.


Preparing the build environment

The disk image is built on macOS using Packer and VirtualBox. VirtualBox is installed using the appropriate platform package downloaded from their website, and Packer is installed using brew:

$ brew install packer

Building the disk image

Three files are needed in order to build the disk image: a Packer template file, an Anaconda kickstart file, and a shell script that is used to configure the disk image after installation. The following files can be used as examples:
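As an illustration, a minimal Packer template along these lines could serve as a starting point. This is a hypothetical sketch, not the exact template from this setup: the ISO URL, the boot command, and the omitted ssh/shutdown settings are placeholder assumptions that must be adapted.

```shell
# Hypothetical minimal template.json; iso_url, boot_command and the
# missing ssh/shutdown settings are assumptions to be filled in.
cat > template.json << 'EOF'
{
  "builders": [
    {
      "type": "virtualbox-iso",
      "iso_url": "http://example.com/CentOS-7-x86_64-Minimal.iso",
      "iso_checksum_type": "none",
      "http_directory": "http",
      "boot_command": [
        "<tab> inst.ks=http://{{ .HTTPIP }}:{{ .HTTPPort }}/ks.cfg<enter>"
      ]
    }
  ],
  "provisioners": [
    { "type": "shell", "scripts": ["scripts/provision.sh"] }
  ]
}
EOF
```

The `http_directory` setting is what lets Packer serve the kickstart file over HTTP during the install.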

Create some directories:

$ mkdir -p ~/work/centos-7-zfs/
$ mkdir -p ~/work/centos-7-zfs/http/
$ mkdir -p ~/work/centos-7-zfs/scripts/

Copy the files to these directories:

$ cp template.json ~/work/centos-7-zfs/
$ cp ks.cfg ~/work/centos-7-zfs/http/
$ cp provision.sh ~/work/centos-7-zfs/scripts/

Modify each of the files to fit your environment.

Start the build process using Packer:

$ cd ~/work/centos-7-zfs/
$ packer build template.json

This will download the CentOS 7 ISO file, start an HTTP server to serve the kickstart file, and start a virtual machine using VirtualBox:

Packer installer screenshot

The virtual machine will boot into Anaconda and run through the installation process as specified in the kickstart file:

Anaconda installer screenshot

When the installation process is complete, the disk image will be available in the output-virtualbox-iso folder with the vmdk extension.

Packer done screenshot

The disk image is now ready to be put in initramfs.

Putting the disk image in initramfs

This section is quite similar to the previous blog post CentOS 7 root filesystem on tmpfs, but with minor differences. For simplicity, it is executed on a host running CentOS 7.

Create the build directories:

$ mkdir /work
$ mkdir /work/newroot
$ mkdir /work/result

Export the files from the disk image to one of the directories we created earlier:

$ export LIBGUESTFS_BACKEND=direct
$ guestfish --ro -a packer-virtualbox-iso-1508790384-disk001.vmdk -i copy-out / /work/newroot/

Modify /etc/fstab:

$ cat > /work/newroot/etc/fstab << EOF
tmpfs       /         tmpfs    defaults,noatime 0 0
none        /dev      devtmpfs defaults         0 0
devpts      /dev/pts  devpts   gid=5,mode=620   0 0
tmpfs       /dev/shm  tmpfs    defaults         0 0
proc        /proc     proc     defaults         0 0
sysfs       /sys      sysfs    defaults         0 0
EOF

Disable selinux:

echo "SELINUX=disabled" > /work/newroot/etc/selinux/config

Disable clearing the screen on login failure to make it possible to read any error messages:

mkdir /work/newroot/etc/systemd/system/getty@.service.d
cat > /work/newroot/etc/systemd/system/getty@.service.d/noclear.conf << EOF
[Service]
TTYVTDisallocate=no
EOF

Now jump to the Initramfs and Result sections in the CentOS 7 root filesystem on tmpfs and follow those steps until the end when the result is a vmlinuz and initramfs file.

ZFS configuration

The first time the NAS server boots on the disk image, the ZFS storage pool and volumes will have to be configured. Refer to the ZFS documentation for information on how to do this, and use the following commands only as guidelines.

Create the storage pool:

$ sudo zpool create data mirror sda sdb mirror sdc sdd

Create the volumes:

$ sudo zfs create data/documents
$ sudo zfs create data/games
$ sudo zfs create data/movies
$ sudo zfs create data/music
$ sudo zfs create data/pictures
$ sudo zfs create data/upload

Share some volumes using NFS:

zfs set sharenfs=on data/documents
zfs set sharenfs=on data/games
zfs set sharenfs=on data/music
zfs set sharenfs=on data/pictures

Print the storage pool status:

$ sudo zpool status
  pool: data
 state: ONLINE
  scan: scrub repaired 0B in 20h22m with 0 errors on Sun Oct  1 21:04:14 2017

config:

	NAME        STATE     READ WRITE CKSUM
	data        ONLINE       0     0     0
	  mirror-0  ONLINE       0     0     0
	    sdd     ONLINE       0     0     0
	    sdc     ONLINE       0     0     0
	  mirror-1  ONLINE       0     0     0
	    sda     ONLINE       0     0     0
	    sdb     ONLINE       0     0     0

errors: No known data errors

October 23, 2017 11:20 PM

April 19, 2017

Holder de ord

PAID JOB: Categorization of election promises

Interested in politics? Are you a student who would like to earn some extra money alongside your studies?

Holder de ord is a politically independent organization whose goal is to make it easier to follow Norwegian parliamentary politics. Among the services we offer is a complete promise database, which currently contains all the promises from the eight parliamentary parties' programmes for the periods 2009-2013 and 2013-2017.

In connection with the 2017 election year, the database will be updated with the promises of all the parties represented in the Storting. The last party conference of the spring ends on 21 May 2017. By then, all eight parties in the Storting will have adopted new programmes for the 2017-2021 period. Based on experience, roughly 7,000 new promises will have been made in total. All of these promises are to be entered into Holder de ord's promise database.


This is a manual job. Each individual promise in the party programmes is copied into an Excel sheet and assigned categories according to the Storting's category system. Some rewriting should be expected so that the promises can stand on their own, without requiring context. The result is then imported into Holder de ord's online database.

In addition, the job involves a rough sorting of promises suitable for use in Holder de ord's chat bot. Separate guidelines for these will be provided. This part of the job will not affect the total workload.


A contract will be signed specifying payment per party programme. Ideally, we would like whoever takes on the assignment to categorize all of the party programmes, or at least more than two. Late delivery may incur daily penalties.

Send a short application by email to Tiina Ruohonen and Hanna Tranås, marked "Valgløfter" in the subject line.

by Hanna Tranås (hanna@holderdeord.no) at April 19, 2017 07:57 PM

February 13, 2017

Mimes brønn

A well full of knowledge for freedom-of-information requests

Mimes brønn is a web service that helps you request access to documents held by the Norwegian public administration, in accordance with the Freedom of Information Act (offentleglova) and the Environmental Information Act. The service has a publicly available archive of all responses received to access requests, so that public bodies need not answer the same requests over and over again. You can find the service at


According to old Norse mythology, the well of knowledge is guarded by Mímir and lies beneath one of the roots of the world tree Yggdrasil. Drinking the water of Mímir's well gave such valuable knowledge and wisdom that the young god Odin was willing to pledge an eye, becoming one-eyed, for permission to drink from it.

The website is maintained by the NUUG association and is particularly well suited for politically interested individuals, organizations and journalists. The service is based on its British sister service WhatDoTheyKnow.com, which has already provided access resulting in documentaries and countless press stories. According to mySociety, a few years ago about 20% of access requests to central government went via WhatDoTheyKnow. We in NUUG hope that NUUG's service Mimes brønn can be just as useful to the inhabitants of Norway.

This weekend the service was updated with a lot of new functionality. The new version works better on small screens, and now shows delivery status for requests, so the sender can more easily check that the recipient's email system has confirmed receipt of the access request. The service was set up by volunteers in the NUUG association and launched in the summer of 2015. Since then, 121 users have sent more than 280 requests about everything from wedding rental of the Opera House and negotiations on the use of Norway's top-level DNS domain .bv, to the registration of housing-benefit applications, and the site is a small treasure chest of interesting and useful information. NUUG has engaged lawyers who can assist with appeals against denied access or deficient case handling.

– "NUUG's Mimes brønn was invaluable when we succeeded in ensuring that the DNS top-level domain .bv remains in Norwegian hands," says Håkon Wium Lie.

The service documents widely varying practices in the handling of access requests, both in response time and in the content of the responses. The vast majority are handled quickly and correctly, but in several cases access has been granted to documents that the responsible agency later wished to withdraw, and in others the redaction was done in a way that did not actually hide the information meant to be redacted.

– "The Freedom of Information Act is a cornerstone of our democracy. It does not care who requests access, or why. The Mimes brønn project is a materialization of this principle, where anyone can request access and appeal refusals, and where the documentation is made public. This makes Mimes brønn one of the most exciting transparency projects I have seen in recent times," says Vegard Venli, the man who got the Tax Administration's register of company ownership opened up.

We in the NUUG association hope Mimes brønn can be a useful tool in keeping our democracy healthy.

by Mimes Brønn at February 13, 2017 02:07 PM

January 06, 2017

Espen Braastad

CentOS 7 root filesystem on tmpfs

Several years ago I wrote a series of posts on how to run EL6 with its root filesystem on tmpfs. This post is a continuation of that series, and explains step by step how to run CentOS 7 with its root filesystem in memory. It should apply to RHEL, Ubuntu, Debian and other Linux distributions as well. The post is a bit terse to focus on the concept, and several of the steps have potential for improvements.

The following is a screen recording from a host running CentOS 7 in tmpfs:


Build environment

A build host is needed to prepare the image to boot from. The build host should run CentOS 7 x86_64, and have the following packages installed:

yum install libvirt libguestfs-tools guestfish

Make sure the libvirt daemon is running:

systemctl start libvirtd

Create some directories that will be used later, however feel free to relocate these to somewhere else:

mkdir -p /work/initramfs/bin
mkdir -p /work/newroot
mkdir -p /work/result

Disk image

For simplicity, we’ll fetch our rootfs from a pre-built disk image, but it is possible to build a custom disk image using virt-manager. I expect that most people would like to create their own disk image from scratch, but this is outside the scope of this post.

Use virt-builder to download a pre-built CentOS 7.3 disk image and set the root password:

virt-builder centos-7.3 -o /work/disk.img --root-password password:changeme

Export the files from the disk image to one of the directories we created earlier:

guestfish --ro -a /work/disk.img -i copy-out / /work/newroot/

Clear fstab since it contains mount entries that no longer apply:

echo > /work/newroot/etc/fstab

SELinux will complain about incorrect disk label at boot, so let’s just disable it right away. Production environments should have SELinux enabled.

echo "SELINUX=disabled" > /work/newroot/etc/selinux/config

Disable clearing the screen on login failure to make it possible to read any error messages:

mkdir /work/newroot/etc/systemd/system/getty@.service.d
cat > /work/newroot/etc/systemd/system/getty@.service.d/noclear.conf << EOF
[Service]
TTYVTDisallocate=no
EOF

Initramfs

We’ll create our custom initramfs from scratch. The boot procedure will be, simply put:

  1. Fetch kernel and a custom initramfs.
  2. Execute kernel.
  3. Mount the initramfs as the temporary root filesystem (for the kernel).
  4. Execute /init (in the initramfs).
  5. Create a tmpfs mount point.
  6. Extract our CentOS 7 root filesystem to the tmpfs mount point.
  7. Execute switch_root to boot on the CentOS 7 root filesystem.

The initramfs will be based on BusyBox. Download a pre-built binary or compile it from source, put the binary in the initramfs/bin directory. In this post I’ll just download a pre-built binary:

wget -O /work/initramfs/bin/busybox https://www.busybox.net/downloads/binaries/1.26.1-defconfig-multiarch/busybox-x86_64

Make sure that busybox has the execute bit set:

chmod +x /work/initramfs/bin/busybox

Create the file /work/initramfs/init with the following contents:

#!/bin/busybox sh

# Dump to sh if something fails
error() {
	echo "Jumping into the shell..."
	setsid cttyhack sh
}

# Populate /bin with binaries from busybox
/bin/busybox --install /bin

mkdir -p /proc
mount -t proc proc /proc

mkdir -p /sys
mount -t sysfs sysfs /sys

mkdir -p /sys/dev
mkdir -p /var/run
mkdir -p /dev

mkdir -p /dev/pts
mount -t devpts devpts /dev/pts

# Populate /dev
echo /bin/mdev > /proc/sys/kernel/hotplug
mdev -s

mkdir -p /newroot
mount -t tmpfs -o size=1500m tmpfs /newroot || error

echo "Extracting rootfs... "
xz -d -c -f rootfs.tar.xz | tar -x -f - -C /newroot || error

mount --move /sys /newroot/sys
mount --move /proc /newroot/proc
mount --move /dev /newroot/dev

exec switch_root /newroot /sbin/init || error

Make sure it is executable:

chmod +x /work/initramfs/init

Create the root filesystem archive using tar. The following command also uses xz compression to reduce the final size of the archive (from approximately 1 GB to 270 MB):

cd /work/newroot
tar cJf /work/initramfs/rootfs.tar.xz .

Create initramfs.gz using:

cd /work/initramfs
find . -print0 | cpio --null -ov --format=newc | gzip -9 > /work/result/initramfs.gz

Copy the kernel directly from the root filesystem using:

cp /work/newroot/boot/vmlinuz-*x86_64 /work/result/vmlinuz


Result

The /work/result directory now contains two files with file sizes similar to the following:

ls -lh /work/result/
total 277M
-rw-r--r-- 1 root root 272M Jan  6 23:42 initramfs.gz
-rwxr-xr-x 1 root root 5.2M Jan  6 23:42 vmlinuz

These files can be loaded directly in GRUB from disk, or using iPXE over HTTP using a script similar to:

kernel http://example.com/vmlinuz
initrd http://example.com/initramfs.gz
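When booting from local disk instead, a GRUB menu entry along these lines should do; the paths are assumptions and depend on where the two files were copied:

```
menuentry "CentOS 7 in tmpfs" {
    linux /vmlinuz
    initrd /initramfs.gz
}
```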

January 06, 2017 08:34 PM

November 12, 2016

Anders Einar Hilden

Perl Regexp Oneliners and UTF-8

For my project to find as many .no domains as possible, I needed a regexp for extracting valid domains. This task is made more fun by the inclusion of Norwegian and Sami characters in the set of valid characters.

In addition to [a-z0-9\-], valid dot-no domains can contain the Norwegian æ (ae), ø (o with stroke) and å (a with ring above) (Stargate, anyone?) and a number of Sami characters. ŧ (t with stroke), ç (c with cedilla) and ŋ (simply called “eng”) are some of my favourites.

The following code will print only the first match per line, and uses ŧ directly in the regexp.

echo "fooŧ.no baŧ.no" | perl -ne 'if(/([a-zŧ]{2,63}\.no)/ig) { print $1,"\n"; }'

If we replace if with while we will print any match found in the whole line.

echo "fooŧ.no baŧ.no" | perl -ne 'while(/([a-zŧ]{2,63}\.no)/ig) { print $1,"\n"; }'

Because I’m afraid the regexp (specifically the non-ASCII characters) may be mangled by being saved and moved between systems, I want to write the Norwegian and Sami characters using their Unicode code points. Perl has support for this using \x{<number>} (see perlunicode).

echo "fooŧ.no baŧ.no" | perl -CSD -ne 'while(/([a-z\x{167}]{2,63}\.no)/ig) { print $1,"\n"; }'

When using code points, I have to specify -CSD for the matching to work. I am not really sure why this is required. If you can explain, please comment or tell me by other means. As you can read in perlrun, -CSD specifies that STDIN, STDOUT, STDERR and all input and output streams should be treated as UTF-8.

Another problem is that if this last solution is fed invalid UTF-8, it will die fatally and stop processing input.

Malformed UTF-8 character (fatal) at -e line 1, <> line X.

To prevent this from happening, I currently sanitize my dirty input using iconv -f utf-8 -t utf-8 -c. If you have a better solution for this, Perl or otherwise, please tell me!

A simple regexp would match the valid characters with a length between 2 and 63, followed by .no. However, I wanted only and all “domains under .no” as counted by Norid in their statistics. Norid's definition of “domains under .no” covers all domains directly under .no, but also domains under category domains, e.g. ohv.oslo.no and ola.priv.no. To get comparable results, I have to collect both *.no and *.<category domain>.no domains when scraping data.

The resulting “oneliner” I use is this… it was once a oneliner, but with more than 10k characters in the regexp it became hard to manage. The resulting script builds up a regexp that is valid for all Norwegian domains, using a list of valid category domains, all valid characters, and the other rules for .no domains.

November 12, 2016 10:00 PM

July 15, 2016

Mimes brønn

Who has drunk from Mimes brønn?

Mimes brønn has now been up for about a year, so we thought it might be interesting to present some brief statistics on how the service has been used.

At the beginning of July 2016, Mimes brønn had 71 registered users who had sent out 120 access requests, of which 62 (52%) were successful, 19 (16%) partially successful, 14 (12%) refused, 10 (8%) received a reply that the body did not hold the information, and 12 requests (10%; 6 from 2016, 6 from 2015) were still unanswered. A small number (3) of the requests could not be categorized. We thus see that about two thirds of the requests were fully or partially successful. That is good!

The time before the body first replies varies a great deal, from the same day (some requests sent to the Immigration Appeals Board, the Public Roads Administration, Økokrim, the Media Authority, the Data Protection Authority, and the Brønnøysund Register Centre), up to 6 months (Ballangen municipality) or longer (the Storting, the Ministry of Petroleum and Energy, the Ministry of Justice and Public Security, UDI – the Directorate of Immigration, and Statistics Norway have received access requests that remain unanswered). The average here was a couple of weeks (excluding the 12 cases where no reply has come). It follows from section 29, first paragraph, of the Freedom of Information Act that requests for access to the administration's documents must be answered "without undue delay", which according to the Parliamentary Ombudsman should in most cases be interpreted as "the same day, or at any rate within 1-3 working days". So there is room for improvement here.

The right of appeal (offentleglova section 32) was used in 20 of the access requests. In most (15; 75%) of those cases the appeal led to the request being successful. The average time to receive a reply to an appeal was one month (excluding 2 cases, appeals sent to the Public Roads Administration and Ruter AS, where no reply has come). It is well worth appealing, and completely free! The Parliamentary Ombudsman has stated that 2-3 weeks is beyond what is acceptable processing time for appeals.

Most requests were sent to the Ministry of Foreign Affairs (9), closely followed by Fredrikstad municipality and the Brønnøysund Register Centre. In all, requests were sent to 60 public authorities, of which 27 received two or more. There are more than 3,700 authorities in Mimes brønn's database, so most of them have yet to receive an access request via the service.

Looking at what kind of information people have requested, we see a broad spectrum of interests: everything from the municipality's parking spaces, travel expense claims where the state's accommodation rates were exceeded, correspondence about asylum reception centres and negotiations about the .bv top-level domain, to documents about Myanmar.

The authorities do all sorts of things. Some of it is done badly, some of it well. The more we find out about how the authorities work, the better placed we are to suggest improvements to what works badly... and to applaud what works well. If there is something you want access to, just click on https://www.mimesbronn.no/ and you are on your way 🙂

by Mimes Brønn at July 15, 2016 03:56 PM

June 01, 2016

Kevin Brubeck Unhammer

Machine translation vs. NTNU examiner

The Twitter user @IngeborgSteine recently got some attention when she tweeted a picture of the Nynorsk version of her economics exam at NTNU:

Dette var min økonomieksamen på "nynorsk". #nynorsk #noregsmållag #kvaialledagar https://t.co/RjCKSU2Fyg
Ingeborg Steine (@IngeborgSteine) May 30, 2016

Creative inventions like *kvisleis and all the dialect forms and archaisms would be unlikely to turn up in a machine-translated version, so I wondered how much better/worse it would have been if the examiner had simply used Apertium instead. Ingeborg Steine was kind enough to post the Bokmål version, so let's give it a try 🙂


No kvisleis, and free of tær and fyr, but it is not perfect either: certain words are missing from the dictionaries and therefore get the wrong inflection, teller is interpreted as a noun, ein anna maskin has the wrong inflection of the first word (a rule was missing there), and at is in one place interpreted as an adverb (leading to the curious fragment det verta at anteke tilvarande). In addition, the website identifies the language as Tatar, so perhaps it was rather heavy Norwegian? 🙂 But these errors are not particularly hard to fix – the development version of Apertium now gives:


There are still a couple of small things that could be fixed, but it is already better than most of the exams I was handed at UiO ...

by unhammer at June 01, 2016 09:45 AM

April 02, 2016

Thomas Sødring

Choices made on the road to 0.1

You can drive yourself mad wondering if you made the right choice with regards to technology. This really is a difficult question to answer as you have to pick components that have longevity and that are in widespread use. The truth is that you just have to pick something and go with it. I think about popularity of libraries, how active development is etc before I make a choice but it’s not easy to just decide. Any component I use now will follow the project and code for a long time going forward.

This week I was wrestling with AngularJS and ReactJS. Basically it boils down to whether I go with Google or Facebook. I picked up some cheap courses on Angular, and that more or less made the decision. I'm not really that bothered about the GUI side of things at the moment, but I do need an administrative GUI and would like to have an idea of how a proof-of-concept GUI would look. Given that this is a REST service, it will be possible to swap Angular out with whatever you want anyway. It is a very time-consuming process trying to figure these things out.

The last month has been spent wondering how I should structure the project. If I get the foundation wrong, it will have a negative effect on the project. Baeldung has an interesting project structure with a clean definition of modules and what belongs within each module. This quickly became the basis of my project structure. I kept coming across jhipster, and after days of hassle (installing npm, bower, yo) I managed to get an interesting project setup. What I learnt from the jhipster sample app was support for swagger, metrics, spring-security, AngularJS and YAML project configuration. I was initially unable to get the jhipster app to run, so I have spent the time studying the code and structure and gradually copied elements over to my project. This has resulted in the nikita code base supporting swagger, the introduction of metrics support, and spring-security user configuration, all copied from the jhipster sample application.

This approach has really saved me a lot of time and answered many questions about spring and spring-based applications. I have learnt so much from it. A plus with an approach where I try to study best practices is that I will hopefully end up with a good project structure and robust code. A negative is that I'm learning as I go along. Ideally I'd sit down and figure everything out in advance, but I think that's the primary reason why it has been difficult to move this project forward over the last couple of years: I never had my own concrete project structure to work with and was unsure how to proceed.

I also switched from Eclipse and Eclipse STS to IntelliJ IDEA. I never seemed to be able to get things working nicely in Eclipse and STS. I always ended up with issues like not being able to find source code when debugging, or downloading sources and documentation not working properly. I spent a lot of time on Stack Exchange, but it really felt like a waste of time and I didn't have an environment I felt comfortable and productive in. IDEA has been a dream to work with. It just does things intuitively, and the integration with git has allowed me to push code and changes quickly to GitHub. I have never been so impressed with an IDE as I have been with IDEA. It just seems to make sense.

I was also able to confirm that OData support is still in the draft version of Noark 5 v4 and will more than likely be in the final version. This complicates development of the REST service significantly, but I think I will solve this in the codebase by supporting two APIs, one with OData and one without. The reason for this is that OData support requires me to handle all incoming HTTP requests manually. To be honest I am unsure about the usefulness of OData in a running installation, but if the standard specifies it then we simply have to implement it. There is very little REST OData support in the Java ecosystem, but there is something available that we can use.

Currently the code is very much a pre-alpha version of v0.1. It is mainly a working project structure with the above-mentioned libraries, with the domain model copied in and the fonds object accessible via a REST controller. Don't expect the code to work until it hits the v0.1 mark, as I am updating it continuously. You can check out the code from the github repository.

by tsodring at April 02, 2016 05:38 AM

April 01, 2016

Thomas Sødring

Current project structure

One of the main challenges with this project is that I am not in a position to work on it full time. In the last month I have probably spent 80 hours on it, and half of that came from my own free time. So whatever time I do have has to be spent wisely. I have few days where I can thoroughly explore issues, and threads of thought get split up over several days.

In the last month I have made some interesting progress. I have spent the time working on the project structure and have moved files around quite a lot.

Currently the project is a multi module maven project with the following modules

core-client is where most of the domain modelling of Noark 5 can be found. All persistence-related objects are here, DTOs etc.

core-common contains a lot of common functionality related to REST handling etc. This is code that could be reused in other Noark 5 REST related projects.

core-conversion will be a REST service that can convert documents from a production format to an archive format. I will only implement integration with LibreOffice, but it is easy to imagine implementing integration with MS Office. I haven't started this yet.

core-extraction will be a standalone executable jar that can extract the contents of the core in accordance with the extraction rules. Currently a weak arkivstruktur.xml generator has been implemented and that’s just to show a proof-of-concept.

core-webapp is the actual web application that is a spring-boot application and starts up a REST service.

Another module that needs to be implemented is core-postjournal, which talks to the database and publishes the postjournal in various formats. Integration with altinn and digipost etc. (core-dispatcher) are all obvious candidates for work, but currently the project needs a clearly defined roadmap, so these can all come later.

All the modules are encapsulated inside a parent module called nikita-noark5-core.

by tsodring at April 01, 2016 01:44 PM

October 18, 2015

Anders Nordby

Fighting spam with SpamAssassin, procmail and greylisting

On my private server we use a number of measures to stop and prevent spam from arriving in the users' inboxes:

- postgrey (greylisting) to delay arrival (hopefully the block lists will be up to date in time to stop unwanted mail; some senders also do not retry).
- SpamAssassin to block mails by scoring different aspects of the emails. Newer versions have URIBL (domain-based, for links in the emails) in addition to the traditional RBL (IP-based) block lists, which works better. I also created my own URIBL block list which you can use, dbl.fupp.net.
- Procmail. For users on my server, I recommend this procmail rule:

  :0
  * ^X-Spam-Status: Yes
  .crapbox/

  It will sort emails with a score indicating spam into the mailbox "crapbox".
- Blocking unwanted and dangerous attachments, particularly for Windows users.

by Anders (noreply@blogger.com) at October 18, 2015 01:09 PM

April 23, 2015

Kevin Brubeck Unhammer


In the previous post in this series I briefly went through various methods for generating translation candidates for bilingual dictionaries; in this post I will look a bit more closely at candidate generation by translating the individual parts of compound words. As mentioned, we already have a dictionary between Bokmål and North Sami, which we want to extend to Bokmål–Lule Sami and Bokmål–South Sami. The dictionary was developed to translate typical "ministry language", so it is full of long compound words. And in Sami we can compound words in roughly the same way as in Norwegian (in addition to a heap of other ways, but we will happily skip those for now). We should be able to exploit this, so that if we know what «klage» (complaint) is in Lule Sami, and we know what «frist» (deadline) is, then we have at least one reasonable hypothesis for what «klagefrist» might be in Lule Sami 🙂

Word splitting is great when you are translating dictionaries. Compounds erroneously written apart are great when you want a little laugh.
«Ananássasuorma» jali «ananássa riŋŋgu»? Ij le buorre diehtet.

So we can use the few translations we already have between Bokmål and Lule/South Sami to create more translations, by translating parts of words and then putting them back together. We also have a few translations lying around between North Sami and Lule/South Sami, so we can use the same method there (and exploit the fact that we have a Bokmål–North Sami dictionary to close the ring back to Bokmål).

Coverage and precision

Unfortunately (in this context) we often have several translations of each word; in the existing Bokmål–Lule Sami dictionaries we are looking at (largely based on Anders Kintel's dictionary), «klage» can be, among other things, gujdalvis, gujddim, luodjom or kritihkka, while «frist» can be ájggemierre, giehtadaláduvvat, mierreduvvam or ájggemærráj. If we allow every left-hand part to combine with every right-hand part, we get 16 possible candidates for this one word! Probably no more than one or two of them are usable (and perhaps not even that). On average we get about twice as many candidates as source words with this method. So we should find ways to cut down on bad candidates.
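In code, this naive cross-product generation might look like the following sketch. The lexicon below is a hypothetical stand-in containing just the part translations for «klage» and «frist» listed above; all names are illustrative.

```python
from itertools import product

# Hypothetical part-translation lexicon (Bokmål part -> Lule Sami candidates),
# seeded with the translations for «klage» and «frist» mentioned above.
part_translations = {
    "klage": ["gujdalvis", "gujddim", "luodjom", "kritihkka"],
    "frist": ["ájggemierre", "giehtadaláduvvat", "mierreduvvam", "ájggemærráj"],
}

def candidates(first, second):
    """Combine every left-part translation with every right-part translation."""
    return [left + right
            for left, right in product(part_translations[first],
                                       part_translations[second])]

print(len(candidates("klage", "frist")))  # 4 x 4 = 16 candidates
```

This makes the combinatorial blow-up concrete: four translations per part already give sixteen compound candidates, which is why filtering matters.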

The complementary challenge is getting good enough coverage. Sometimes we find that we have no translation of the parts of a word, even though we have translations of words containing those same parts. That sentence probably needs an example 🙂 We would like a candidate for the word «øyekatarr» (eye catarrh) in Lule Sami, i.e. the compound «øye+katarr». We may have a translation for «øye» in our material, but nothing for «katarr». However, the material says that «blærekatarr» (bladder catarrh) is gådtjåráhkkovuolssje. So to extend the coverage, we can additionally split our source material into all pairs of compound parts; if we know that these words can be analysed as «blære+katarr» and gådtjåráhkko+vuolssje, then it would seem that «blære» is gådtjåráhkko and «katarr» is vuolssje (and Giellatekno fortunately has good morphological analysers that split such words at the right place). This gives a good extension of the material – in fact we get candidates for almost twice as many of the words we want candidates for, if we extend the source material in this way. But it has a big downside too: we get more than twice as many Lule/South Sami candidates per Bokmål word (on average around four candidates per source word).
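The coverage trick can be sketched like this: given two-part analyses of both sides of an existing dictionary pair, align the parts positionally and emit each part pair as a new (hypothetical) part translation. The hard-coded analyses are stand-ins for what Giellatekno's analysers would return.

```python
# Existing dictionary pair, with the two-part compound analyses as the
# morphological analysers would split them (illustrative stand-in data).
pairs = [("blærekatarr", "gådtjåráhkkovuolssje")]
analyses = {
    "blærekatarr": ("blære", "katarr"),
    "gådtjåráhkkovuolssje": ("gådtjåráhkko", "vuolssje"),
}

def part_pairs(pairs, analyses):
    """Split each dictionary pair into positionally aligned part pairs."""
    out = set()
    for src, tgt in pairs:
        if src in analyses and tgt in analyses:
            (s1, s2), (t1, t2) = analyses[src], analyses[tgt]
            out.add((s1, t1))  # first part <-> first part
            out.add((s2, t2))  # second part <-> second part
    return out

print(sorted(part_pairs(pairs, analyses)))
```

The resulting hypothesis «katarr»–vuolssje can then be reused to build candidates like «øyekatarr», even though «katarr» never occurred on its own in the dictionary.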

Filtering and ranking

We want to narrow the possible candidates down to those that are most likely good. The best test is to check whether the candidate occurs in a corpus, preferably in the same aligned parallel sentence (such a candidate is usually good). If not, we can also check whether the candidate and the source word have similar frequencies, or whether the candidate has any frequency at all.

The word-part translation suggested tsavtshvierhtie for «virkemiddel» (policy instrument), and the two even occurred in a pair of parallel sentences:
<s xml:lang="sma" id="2060"/>Daesnie FoU akte vihkeles tsavtshvierhtie .
<s xml:lang="nob" id="2060"/>Her er FoU er et viktig virkemiddel .

– so that is probably a good word pair.

Unfortunately we have so little text material for Lule/South Sami that we quickly run out of candidates with any frequency at all. For South Sami, for example, we only have candidates with corpus hits for around 10 % of the words we generate candidates for.

Another test, which works for all words, is to check whether the candidate is accepted by our morphological analysers; if it is not (and it additionally has no corpus hits), it is usually wrong. But this only removes about a quarter of the candidates; with our split-up dictionary (where we also include pairs of word parts) we still have around three candidates per source word on average.

(One test I tried but rejected was filtering based on similar word length. It seems logical that long words translate to long words and short to short, but there are many good exceptions. Besides, it removes far too few bad candidates to seem worth it.)

Our parallel corpus material is far too small, but when we generate candidates for dictionaries, it is not parallel sentences we are trying to predict, but parallel words and dictionary pairs. And then our training material is really our existing dictionaries … So I tried looking at which compound parts were actually used in our earlier translations, which pairs of parts occurred often in earlier translations, and which parts rarely or never did. For example, our split-up Bokmål–Lule Sami dictionary has these pairs:

Here we see that «løyve» (licence) can be either loahpádus or doajmmaloahpe – should «taxiløyve» then be táksiloahpádus or táksidoajmmaloahpe? Based on this material we should probably go for the former – even though doajmmaloahpe is listed, only loahpádus actually occurs in compound words.

We can then try to generate candidates for all the Bokmål words in our material, both the ones we actually want candidates for and the ones we already have translations for. Then we go through the generated candidates for the words we already have translations for, and count the pairs of word parts that generated such words. Perhaps we created the candidates barggo+loahpádus and barggo+dajmmaloahpe for «arbeids+løyve» (work licence); when we then go through the existing translations and find that «arbeidsløyve» was in the dictionary with the translation barggoloahpádus, we increase the frequency of the pair «løyve»–loahpádus by one, while «løyve»–dajmmaloahpe stays at zero.

So far I have only filtered out the candidates where the pair for either the first or the second element had zero frequency. According to some manual evaluation by a linguist, it is almost only bad words that get thrown out, so that filter seems to work well. On the other hand, only around 10 % of the candidates are removed if we only discard those with zero frequency, so the next step is to use the frequencies to produce a full ranking.
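The counting and zero-frequency filter described above can be sketched as follows, with the «arbeidsløyve» example hard-coded (all data structures and names are illustrative, not the project's actual code):

```python
from collections import Counter

# Existing dictionary translations we can "train" on.
existing = {"arbeidsløyve": "barggoloahpádus"}

# Generated candidates: word -> list of ((pair1, pair2), candidate), where
# each pair is a (source-part, target-part) tuple.
generated = {
    "arbeidsløyve": [
        ((("arbeids", "barggo"), ("løyve", "loahpádus")), "barggoloahpádus"),
        ((("arbeids", "barggo"), ("løyve", "dajmmaloahpe")), "barggodajmmaloahpe"),
    ],
}

# Count the part pairs whose generated candidate matches a known translation.
freq = Counter()
for word, cands in generated.items():
    for (pair1, pair2), cand in cands:
        if existing.get(word) == cand:
            freq[pair1] += 1
            freq[pair2] += 1

def keep(pair1, pair2):
    """Discard a candidate if either of its part pairs has zero frequency."""
    return freq[pair1] > 0 and freq[pair2] > 0

print(keep(("arbeids", "barggo"), ("løyve", "loahpádus")))    # True
print(keep(("arbeids", "barggo"), ("løyve", "dajmmaloahpe")))  # False
```

Since «arbeidsløyve» was observed as barggoloahpádus, the pair «løyve»–loahpádus gets frequency one while «løyve»–dajmmaloahpe stays at zero, so only the first candidate survives the filter.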

If every word could be split into exactly two parts, it might be enough to count pairs of parts and individual parts to estimate probabilities, i.e. f(s,t)/f(s). But sometimes words can be split in several ways; for example, we can view «sommersiidastyre» as «sommer+siidastyre» or «sommersiida+styre» (I have chosen to stick to two-way splits, to avoid too many alternative candidates). If the translation is giessesijddastivrra, with the analyses giesse+sijddastivrra or giessesijdda+stivrra, then we have no immediate reason to prefer one over the other (well, we have length in this case, but that does not hold in all such examples, and we can have pairs of analyses that are 2–3 or 3–2). Then we also cannot say which pair of word parts (s,t) to increment when we see «sommersiidastyre»–giessesijddastivrra in the training material. But if we additionally see «styre»–stivrra somewhere else, we suddenly have grounds for a decision. Methods like Expectation Maximization can combine related frequencies in this way to arrive at good estimates, but I have not yet got around to implementing this.
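A toy sketch of how EM-style fractional counting could break such segmentation ties (this is my illustration of the general idea, not the project's implementation): the ambiguous «sommersiidastyre»–giessesijddastivrra observation is weighted by the current pair probabilities, and the unambiguous «styre»–stivrra evidence gradually tips the scales.

```python
from collections import Counter

# Each observation is a list of alternative segmentations; each segmentation
# is a tuple of (source-part, target-part) pairs. Toy data from the text:
observations = [
    # Ambiguous: sommer+siidastyre or sommersiida+styre.
    [(("sommer", "giesse"), ("siidastyre", "sijddastivrra")),
     (("sommersiida", "giessesijdda"), ("styre", "stivrra"))],
    # Unambiguous evidence for styre–stivrra from elsewhere in the dictionary.
    [(("styre", "stivrra"),)],
]

def em(observations, iterations=10):
    """Estimate part-pair probabilities with EM-style fractional counts."""
    pairs = {p for obs in observations for seg in obs for p in seg}
    prob = {p: 1.0 / len(pairs) for p in pairs}  # uniform start
    for _ in range(iterations):
        counts = Counter()
        for obs in observations:
            # E-step: weight each alternative segmentation by the product
            # of its current pair probabilities, normalised per observation.
            weights = []
            for seg in obs:
                w = 1.0
                for p in seg:
                    w *= prob[p]
                weights.append(w)
            total = sum(weights)
            for seg, w in zip(obs, weights):
                for p in seg:
                    counts[p] += w / total
        # M-step: renormalise the fractional counts into probabilities.
        total = sum(counts.values())
        prob = {p: c / total for p, c in counts.items()}
    return prob

prob = em(observations)
# The unambiguous «styre»–stivrra evidence favours the sommersiida+styre split.
print(prob[("styre", "stivrra")] > prob[("sommersiida", "giessesijdda")])  # True
```

After a few iterations the segmentation containing «styre»–stivrra carries most of the weight for the ambiguous observation, which is exactly the tie-breaking behaviour described above.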

by unhammer at April 23, 2015 06:11 PM

April 14, 2015

NUUG events video archive


April 14, 2015 11:13 AM

February 12, 2015

Salve J. Nilsen

On Bandwagonbuilders and Bandwagoneers

<Farnsworth>Good news, everyone!</Farnsworth>

The League of Bandwagonbuilders have spoken – Perl 6 is likely to be “production ready” sometime in 2015! This means it’s time for the Bandwagoneers to start preparing.

Bandwagoneer – that’s you and me, although you may call yourself something different. Perl Monger. Perl Enthusiast. Or just someone who has realized that all volunteer-based Open Source communities need people who care about making stuff happen in meatspace.

At Oslo.pm (I’m a board member there), we’re doing exactly that. We’re Bandwagoneers, spending some of our own valuable time showing others where the cool stuff is, and showing them how to get it. Here’s some of what we’re up to:

Also worth mentioning: a few weeks ago we had an introduction to Perl 6's Foreign Function Interface (called NativeCall), courtesy of Arne Skjærholt. It was quite useful, and I hear Arne's happy to accept invitations from Perl Monger groups to come visit and give the same presentation. :)

Enough bragging, already!

Being a Bandwagoneer means it’s your task to make stuff happen. There are many ways to do it, and I hope you can find some inspiration in what Oslo.pm is doing. Maybe get in touch with some of the Bandwagonbuilders in #perl6 on irc.freenode.org, and ask if anyone there would like to visit your group? I think that would be cool.

Get cracking! 😀

by sjn at February 12, 2015 11:20 PM

January 06, 2015


NSA-proof SSH

One of the biggest takeaways from 31C3 and the most recent Snowden-leaked NSA documents is that a lot of SSH stuff is .. broken.

I’m not surprised, but then again I never am when it comes to this paranoia stuff. However, I do run a ton of SSH in production and know a lot of people that do. Are we all fucked? Well, almost, but not really.

Unfortunately, most of what Stribika writes about the "Secure Secure Shell" doesn't work for old production versions of SSH. The cliff notes for us real-world people, who will realistically be running SSH 5.9p1 for years, are hidden in the bettercrypto.org repo.

Edit your /etc/ssh/sshd_config:

Ciphers aes256-ctr,aes192-ctr,aes128-ctr
MACs hmac-sha2-512,hmac-sha2-256,hmac-ripemd160
KexAlgorithms diffie-hellman-group-exchange-sha256

Basically, the nice and forward-secure aes-*-gcm and chacha20-poly1305 ciphers, the curve25519-sha256 KEX algorithm, and the Encrypt-then-MAC message authentication modes are not available to those of us stuck in the early 2000s. That's right, the provably NSA-proof stuff is not supported. Upgrading at this point makes sense.

Still, we can harden SSH, so go into /etc/ssh/moduli and delete all the moduli that have 5th column < 2048, and disable ECDSA host keys:

cd /etc/ssh
mkdir -p broken
mv moduli ssh_host_dsa_key* ssh_host_ecdsa_key* ssh_host_key* broken
awk '{ if ($5 > 2048){ print } }' broken/moduli > moduli
# create broken links to force SSH not to regenerate broken keys
ln -s ssh_host_ecdsa_key ssh_host_ecdsa_key
ln -s ssh_host_dsa_key ssh_host_dsa_key
ln -s ssh_host_key ssh_host_key

Your clients, which hopefully have more recent versions of SSH, could have the following settings in /etc/ssh/ssh_config or .ssh/config:

Host all-old-servers

    Ciphers aes256-gcm@openssh.com,aes128-gcm@openssh.com,chacha20-poly1305@openssh.com,aes256-ctr,aes192-ctr,aes128-ctr
    MACs hmac-sha2-512-etm@openssh.com,hmac-sha2-256-etm@openssh.com,hmac-ripemd160-etm@openssh.com,umac-128-etm@openssh.com,hmac-sha2-512,hmac-ripemd160
    KexAlgorithms curve25519-sha256@libssh.org,diffie-hellman-group-exchange-sha256

Note: Sadly, the -ctr ciphers do not provide forward secrecy, and hmac-ripemd160 isn't the strongest MAC. But if you disable these, there are plenty of places you won't be able to connect to. Upgrade your servers to get rid of these poor auth methods!

Handily, I have made a little script to do all this and more, which you can find in my Gone distribution.

There, done.


Updated Jan 6th to highlight the problems of not upgrading SSH.
Updated Jan 22nd to note CTR mode isn’t any worse.
Go learn about COMSEC if you didn’t get trolled by the title.

by kacper at January 06, 2015 04:33 PM

December 08, 2014


sound sound


Recently I've been doing some video editing.. less editing than tweaking my system, though.
If you want your JACK output to talk to Kdenlive, a most excellent video editing suite,
and output audio nicely without choppiness and popping, which I promise you is not nice,
you'll want to pipe it through PulseAudio, because the ALSA-to-JACK stuff doesn't do well with Phonon, at least not on this convoluted setup.

Remember, to get that setup to work, ALSA pipes to JACK with the pcm.jack { type jack .. } definition, and you need to remove the alsa-to-pulseaudio stupidity at /usr/share/alsa/alsa.conf.d/50-pulseaudio.conf
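For reference, a minimal sketch of such a pcm.jack stanza, assuming the alsa-plugins JACK PCM plugin and the default JACK system port names (adjust ports to your own setup):

```
pcm.jack {
    type jack
    playback_ports {
        0 system:playback_1
        1 system:playback_2
    }
    capture_ports {
        0 system:capture_1
        1 system:capture_2
    }
}
```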

So, once that's in place, it still won't play even though Pulse found your JACK, because your clients are defaulting to some ALSA device… this is when you change /etc/pulse/client.conf and set default-sink = jack_out.

by kacper at December 08, 2014 12:18 AM

November 18, 2014

Anders Einar Hilden

Changing the Subnet Mask in Vmware Workstation on Debian Jessie

I’m currently attending SANS SEC504: Hacker Tools, Techniques, Exploits and Incident Handling in London. For some of the labs in the course we need machines on the IPs and with a subnet mask of

Changing the subnet mask for the NAT or host-only networks in VMware Workstation seems like such an easy thing to do. According to VMware it should be as easy as opening the Virtual Network Editor and "type a new value in the Subnet mask text box".

Oh wait … I can’t change it. The field for subnet mask in the Virtual Network Editor is not editable.

VMware Virtual Network Editor: the field for subnet mask is not editable

Let's keep googling - plenty of matches, but everyone keeps insisting it can be changed in the GUI, or mixes up the subnet mask with the subnet IP. Some posts blame permissions, but since the Virtual Network Editor always runs as root, that's not the problem. There are no listings for the vmnets in /etc/network/interfaces or /etc/network/interfaces.d/, and changing the subnet mask in NetworkManager does nothing.

After a lot of thinking (and just after I checked /etc/network/interfaces) I found /etc/vmware/networking - BINGO! This looks like just the file we were looking for.

Before editing the file we should stop any vmware-related services that might use these files.

$ sudo service vm<TAB>
vmamqpd vmware vmware-USBArbitrator vmware-workstation-server

I'm not sure which of these services use the files we are editing, so we'll stop them all:

$ sudo service vmamqpd stop
$ sudo service vmware stop
$ sudo service vmware-USBArbitrator stop
$ sudo service vmware-workstation-server stop

For the SANS course I have set up a new host-only network, vmnet2. Since we are using static IPs, and will be running malware on these systems, I have disabled DHCP and not connected a host virtual adapter. The shared folder option Map as a network drive in Windows guests still works, don't ask me how. Below is the configuration for vmnet2 with a subnet mask of

answer VNET_2_DHCP no
answer VNET_2_DHCP_CFG_HASH E9892EF1006EBB5D4996DF1A377B10EB0D542B94

Success! (but continue reading, we update the DHCP configuration below the picture)

VMware Virtual Network Editor: the uneditable field contains the subnet mask we wanted

VMware stores DHCP config and leases in /etc/vmware/vmnet<NUM>/dhcpd/. If we have changed the subnet IP, subnet mask, or turned DHCP on or off, these files need to be updated. The config file contains autogenerated information surrounded by "DO NOT MODIFY SECTION" markers, so we should probably not edit it manually.

If we open VMware Virtual Network Editor (sudo vmware-netcfg), change a setting (e.g. the subnet IP from to), save, and then change it back again, VMware will update the files for us.

November 18, 2014 02:50 PM

February 24, 2013

Bjørn Venn

Chromebook; a real cloud computer – but will it work in the clouds?

Want one? It is not for sale in Norway yet, but you can buy it on Amazon. Read here how I bought mine on Amazon (scroll a bit down the page). With Norwegian VAT, delivered to the Rimi shop 100 metres from where I live, it came to 1,850 kroner. It is absolutely worth it :)

by Bjorn Venn at February 24, 2013 07:34 PM

February 22, 2013

Bjørn Venn

Who can get me one of these before Easter?

Chromebook pixel

Google's new Chromebook, the Chromebook Pixel. So far only on sale in the US and UK, via Google Play and Best Buy.

The world is unfair :)

by Bjorn Venn at February 22, 2013 12:44 PM

January 07, 2013

NUUG events video archive

Challenges in identity management and authentication

Dag-Erling Smørgrav points out how Unix has an authentication paradigm that has not changed in 40 years, while major developments have taken place on this front in recent years.

January 07, 2013 10:00 PM

May 29, 2012

Salve J. Nilsen

Inviting to the Moving to Moose Hackathon 2012

Oslo Perl Mongers are organizing a hackathon for everyone who would like to dive deep into the details of Moose! We have invited the #p5-mop crowd to work on getting a proper Meta Object Protocol into Perl core, and we’ve invited the #perlrdf crowd to come and convert the Perl RDF toolchain to Moose.

You’re welcome to join us!

Special rebate for members of the Perl and CPAN communities

If you're working on a project that is considering moving to Moose, then you're especially welcome! We have a set of promo codes you can use when signing up for the hackathon. Please get in touch with us (or some of the existing participants) to get your promo code and a significant rebate!

Commercial tickets available

Would you like to support the hackathon, but don’t have access to a sponsorship budget? Does your company plan on using Moose, and sees the value of having excellent contacts in the open source communities around this technology? For you, we have a limited amount of commercial tickets. Please check out the hackathon participation page for details.

Sponsorship opportunities

The hackathon is already well sponsored, but there is room for more! If you want to support us, please contact the organizers as soon as possible!

Who can come?

In short: Everyone who cares about Moose and object-oriented programming in Perl! We’re trying to make the Perl community better by hacking on the stuff that makes the biggest difference (at least in our eyes ;)). If you agree, you’re very welcome to join us! Check out the event site for details, and get in touch with us on IRC if you’re interested.

And finally, keep in mind Oslo.pm’s 2-point plan:

  1. Do something cool
  2. Tell about it!

See you in Stavanger?

by sjn at May 29, 2012 02:44 PM

October 31, 2011

Anders Nordby

Tailing the wtmp log on 64-bit Linux with Perl?

I like to make things happen event-based, and to that end I have made a script that rsyncs content after upload via FTP. I tail the wtmp log with Perl, and start the sync when the user is, or has just been, logged out (short idle timeout). Tailing wtmp on FreeBSD was something I found a working example of on the net long ago:
$typedef = 'A8 A16 A16 L';
$sizeof = length pack($typedef, () );
while ( read(WTMP, $buffer, $sizeof) == $sizeof ) {
    ($line, $user, $host, $time) = unpack($typedef, $buffer);
    # Do whatever you want with these values here
}
So FreeBSD only uses the values line (ut_line), user (ut_name), host (ut_host) and time (ut_time), cf. utmp.h. Linux (x64 - who cares about 32-bit?), on the other hand, stores a whole lot more in the wtmp log, and after a fair amount of googling, trial and error, and peeking in bits/utmp.h, I arrived at:
$typedef = "s x2 i A32 A4 A32 A256 s2 l i2 i4 A20";
$sizeof = length pack($typedef, () );
while ( read(WTMP, $buffer, $sizeof) == $sizeof ) {
    ($type, $pid, $line, $id, $user, $host, $term, $exit,
     $session, $sec, $usec, $addr, $unused) = unpack($typedef, $buffer);
    # Do whatever you want with these values here
}
Which just works - great. I can then watch users logging in and out in real time, and take actions based on that.
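The same fixed-size record format can be read from Python's standard library as well; here is a sketch assuming glibc's struct utmp layout on x86-64 (384-byte records), mirroring the Perl template above:

```python
import struct

# Record layout of glibc's struct utmp on x86-64, mirroring the Perl
# template "s x2 i A32 A4 A32 A256 s2 l i2 i4 A20" (cf. bits/utmp.h):
# ut_type, ut_pid, ut_line[32], ut_id[4], ut_user[32], ut_host[256],
# ut_exit (2 shorts), ut_session, ut_tv (2 ints), ut_addr_v6[4], padding.
UTMP_FMT = "hi32s4s32s256s2hi2i4i20s"
UTMP_SIZE = struct.calcsize(UTMP_FMT)  # 384 bytes with native alignment

def parse_wtmp(stream):
    """Yield one dict per complete utmp record read from a binary stream."""
    while True:
        buf = stream.read(UTMP_SIZE)
        if len(buf) < UTMP_SIZE:
            return
        (ut_type, pid, line, ut_id, user, host, term, exit_status,
         session, sec, usec, a0, a1, a2, a3, _pad) = struct.unpack(UTMP_FMT, buf)
        yield {
            "type": ut_type,
            "pid": pid,
            "line": line.rstrip(b"\0").decode(errors="replace"),
            "user": user.rstrip(b"\0").decode(errors="replace"),
            "host": host.rstrip(b"\0").decode(errors="replace"),
            "time": sec,
        }
```

Usage would be something like `for rec in parse_wtmp(open("/var/log/wtmp", "rb")): ...`; for actual tailing you would seek to the end and poll for new complete records.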

by Anders (noreply@blogger.com) at October 31, 2011 07:37 PM

A complete feed is available in any of your favourite syndication formats linked by the buttons below.

[RSS 1.0 Feed] [RSS 2.0 Feed] [Atom Feed] [FOAF Subscriptions] [OPML Subscriptions]