<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom"><title>Liuw's Thinkpad</title><link href="http://blog.liuw.name/" rel="alternate"></link><link href="http://blog.liuw.name/feeds/all.atom.xml" rel="self"></link><id>http://blog.liuw.name/</id><updated>2016-03-25T17:16:00+00:00</updated><entry><title>Factors for Sustaining an Open Source Project</title><link href="http://blog.liuw.name/factors-for-sustaining-an-open-source-project.html" rel="alternate"></link><updated>2016-03-25T17:16:00+00:00</updated><author><name>Wei Liu</name></author><id>tag:blog.liuw.name,2016-03-25:factors-for-sustaining-an-open-source-project.html</id><summary type="html">&lt;p&gt;When I was young and naive, working on open source project was just
writing code to me. Nowadays, it means much more than that. No doubt
there are many geniuses in the world who create amazing open
source software projects, but to sustain those projects going forward,
I think we can't rely merely on raw talent. Quite a few factors
are involved, and they are intertwined -- hence the order in which
I mention them doesn't really matter.&lt;/p&gt;
&lt;p&gt;The first factor is acquiring new talent: building a pool of people
who have the skill and willingness to work on a particular
project. When a project is young and exciting, that's probably not too
hard. But when a project becomes mature, it might be perceived as
uncool, which makes it harder to attract new talent.&lt;/p&gt;
&lt;p&gt;The second factor is having good governance. This is easier said than
done. Having any governance is better than having none, but
bad governance is detrimental to a project in the long run. Good
governance is crucial to enlarging the community -- hence important to
acquiring both new talent and new users. For open source software this
probably means clear documentation on how the project is run, a path to
the top for the ambitious, a clear guide on how to contribute, etc.&lt;/p&gt;
&lt;p&gt;The third factor is commercial interest. It's absolutely not a shame
to make money from open source software. On the contrary, I would be
very happy to see people use open source software to make money in
ethical ways. On an individual level, people need to pay their bills
after all. On an organisational level, a company needs a return on its
investment.&lt;/p&gt;
&lt;p&gt;The fourth factor is tooling. That includes communication channels,
review tools, infrastructure and so on and so forth -- anything that
would affect how people cooperate.&lt;/p&gt;
&lt;p&gt;The fifth factor is engaging with upstream and downstream. Work with
them, don't suddenly introduce intrusive / controversial changes or
incompatible breakages. Be nice to them and they will be nice to you.&lt;/p&gt;
&lt;p&gt;The project I work for used to suffer quite a bit from lacking
everything I mentioned above. Luckily it survived. But it's not yet
all perfect -- there are quite a few annoyances waiting to be fixed in
governance and in onboarding new talent. The good thing is that the
community is aware of those roadblocks and is actively trying to remove
them.&lt;/p&gt;</summary></entry><entry><title>Random Thoughts</title><link href="http://blog.liuw.name/random-thoughts.html" rel="alternate"></link><updated>2016-03-13T15:20:00+00:00</updated><author><name>Wei Liu</name></author><id>tag:blog.liuw.name,2016-03-13:random-thoughts.html</id><summary type="html">&lt;p&gt;I once talked with my friend, who worked in the field of
virtualisation but at a proprietary software company, about work
practices and code quality. His conclusion was that the open source
solution was not as well tested and organised as the proprietary one.&lt;/p&gt;
&lt;p&gt;I actually agreed with him, because open source software seemed to be
generally lacking investment. It's next to impossible for our project
to attract the amount of money invested in an equivalent proprietary
product. While many companies use open source projects in their
products, either those projects are not a critical part of
the products, or the companies have enough talent to essentially maintain their own
fork if the upstream projects go unmaintained.&lt;/p&gt;
&lt;p&gt;Why should companies invest in upstream, if at all? Surely they don't
want to easily give away their technology; on the other hand, they
want useful stuff from upstream (contributed by other
entities). Basically the incentive is to share as little as possible
but gain as much as possible. The only concern is that their own fork
might diverge from upstream, which then makes pulling in changes
impossible. This is not insurmountable provided they have enough money
to pay for the ongoing maintenance burden.&lt;/p&gt;
&lt;p&gt;So I think the open source software development model only works if
the companies that contribute to open source software projects are
not directly making money off the software itself. Open source
software can be a core part of their infrastructure (service
provider), a basis of their offering (software company *), or
a booster for selling their core product (hardware company).&lt;/p&gt;
&lt;p&gt;Needless to say, open source software is important. It's a utility
now. As an individual who is enthusiastic about developing open source
software, having relevant experience in the field should be enough to
get myself a job. But then I am a bit pessimistic about business models
based on open source software. I surely don't want to start another
utility company now.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Note the line for software companies is a bit blurred -- one could
argue that they are directly making money off open source software,
but I disagree. Because the software is freely (as in beer)
distributed, they can't or don't charge money for it; instead, they
sell support contracts.&lt;/li&gt;
&lt;/ul&gt;</summary></entry><entry><title>Rant: Pissed off by Politics</title><link href="http://blog.liuw.name/rant-pissed-off-by-politics.html" rel="alternate"></link><updated>2016-03-06T23:23:00+00:00</updated><author><name>Wei Liu</name></author><id>tag:blog.liuw.name,2016-03-06:rant-pissed-off-by-politics.html</id><summary type="html">&lt;p&gt;Many Linux distributions function on the basis that people around the
world subscribe to their ideology. In such a model nobody can impose
anything on anybody else. On a large scale this model works well
enough to keep several major distributions functioning properly and
producing high quality software. But on a smaller scale, things can go
wrong.&lt;/p&gt;
&lt;p&gt;I have been a user of one Linux distribution, D, for many years. I like
the way it is organised. I'm quite satisfied with the quality of the
distribution overall.&lt;/p&gt;
&lt;p&gt;Then there is one program I use almost daily. The first time I
used it, I found that the version in D was quite old compared to the
upstream version. I sent an email to ask the package maintainer if I
could help upgrade the package to the latest version. It turned out
the package maintainer was aware of the gap but didn't want to
upgrade to the latest version because he and the upstream maintainer
held different ideas about how things should be done.&lt;/p&gt;
&lt;p&gt;Over the course of 6 years, things didn't improve. I wasn't
alone. Many people offered helping hands, but all the effort was
blocked for the same reason. And to be honest, to this day nobody
really understands what the real reason blocking this whole
thing is -- it dates back 16 years according to my own
archaeology, and there isn't any public record. The only information
people get is that there were discussions and several proposals many
years ago.&lt;/p&gt;
&lt;p&gt;The higher level reason is that it "doesn't meets the normal distro
expectations and standards". In order not to make a fool of myself and
take the wrong side, I checked how things are done in other popular
distributions like G, A and F. It seems that they do have a relatively
new version and aren't concerned with the blocker mentioned in D's
package.&lt;/p&gt;
&lt;p&gt;At this point I feel frustrated and pissed. On one hand I respect the
package maintainer because he obviously put a huge amount of effort into
this package -- he went as far as writing a huge patch to do things
the way he saw fit and maintained that patch for years; on the other
hand it looks like he is holding everybody using D hostage to get
his proposal accepted upstream.&lt;/p&gt;
&lt;p&gt;Politics is inevitable in life. But I don't want to see a
volunteer-based open source project turned into an arena for people
who push their private agendas while actively harming the interests
of the wider community.&lt;/p&gt;
&lt;p&gt;I will either package my own version or write to the D mailing list to
take over the project. As of today I've finished a preliminary version
of the new package, with all the cruft removed. I also filed a new bug
report to urge the package maintainer to reconsider upgrading to the
latest upstream version.  I don't like to aggressively take over
things, but for the benefit of the larger community, I will do what needs
to be done.&lt;/p&gt;</summary></entry><entry><title>Dabbling in Data Structures and Algorithms Once Again</title><link href="http://blog.liuw.name/dabbling-in-data-structures-and-algorithms-once-again.html" rel="alternate"></link><updated>2016-02-27T01:11:00+00:00</updated><author><name>Wei Liu</name></author><id>tag:blog.liuw.name,2016-02-27:dabbling-in-data-structures-and-algorithms-once-again.html</id><summary type="html">&lt;p&gt;Strange as it may seem, I don't think I have ever dealt with coding interviews
well. Not that I've never been to one; it is just that I didn't really
get my job offers by solving puzzles and writing linked list, tree or
graph algorithms.&lt;/p&gt;
&lt;p&gt;When I was in school, I dealt with low level stuff. As a system
administrator I mostly wrote scripts. I just needed to understand how
a system worked. I worked on various kernels, but most data structures
and algorithms there weren't rocket science. I studied algorithms
using some of the prestigious books in the field. But I never used
them in real life. I thought data structures and algorithms were fun
and boring at the same time: fun because they were like solving
puzzles, boring because I had no use for them. In the end I didn't spend
much time on the subject.&lt;/p&gt;
&lt;p&gt;Then came the job hunting season. Algorithm-related questions are
quite pervasive in interviews. I think I did badly on most
of them. But I used my other skills to get some job offers. I
considered myself lucky.&lt;/p&gt;
&lt;p&gt;Nowadays I ponder from time to time what it would be like to dabble in
data structures and algorithms again -- what if I want to change jobs
at some point? Better to keep myself sharp and be ready all the time.  So
I registered an account on one of the online judge services and picked
an easy question.&lt;/p&gt;
&lt;p&gt;The first thing I had to do was to convince myself to leave all the
error handling logic alone. I stopped asking questions like "what if
the memory allocation fails" and "how can I sanitise the input" and
blundered on.&lt;/p&gt;
&lt;p&gt;The second thing that baffled me was the environment. I was presented
with a web-based editor. My muscle memory worked against me. Every
time I tried to use Emacs keybindings the editor did something I
didn't expect. My train of thought was constantly interrupted. I guess I could
have used my own development environment to make things easier.&lt;/p&gt;
&lt;p&gt;Then I sadly discovered that I had basically forgotten all the relevant bits
of the subject. Due to time constraints, I tried to invent a caching
algorithm for the problem, only to find it harder and harder to
debug.&lt;/p&gt;
&lt;p&gt;At that point I was quite frustrated. The problem was marked as
"easy". I was either too stupid to even solve an easy problem or
missed something very obvious.&lt;/p&gt;
&lt;p&gt;It turned out I did miss something. There was an O(n) algorithm for
the problem with only a few lines of code. I blinded myself with my
horribly obscure algorithm -- though it was quite close to the
correct one. I had the "aha" moment when I looked at how other people
solved it and wondered why I had missed that.&lt;/p&gt;
&lt;p&gt;So my first return to the subject did not go well. That is
more or less expected. People get rusty over time. And in my defense I
am not used to interview style coding at all. The next step is to pick
up the books and train myself to get used to the restricted
environment. Hopefully I can reacquaint myself with this subject and
make it useful at some point.&lt;/p&gt;</summary></entry><entry><title>The Internet of Useless Opinions</title><link href="http://blog.liuw.name/the-internet-of-useless-opinions.html" rel="alternate"></link><updated>2016-02-17T00:24:00+00:00</updated><author><name>Wei Liu</name></author><id>tag:blog.liuw.name,2016-02-17:the-internet-of-useless-opinions.html</id><summary type="html">&lt;p&gt;Hacker News is one of the few sites I go to daily because I find
the majority of the submissions are of high quality. I seldom comment for
various reasons (maybe I should expound on that in another post), but I
do enjoy reading interesting and insightful comments about how things
work, less-well-known history and intellectual debates.&lt;/p&gt;
&lt;p&gt;As with all other Internet forums, there are trolls, fanboys, fanatics
and so on, so I see all sorts of opinions. Strangely I never seem to
have problems with opinions even if they seem most outlandish,
irrational, absurd or irritating. Those are, after all, opinions. And
in my own humble opinion, the value of opinions from some random
stranger on the Internet is close to zero. And yes, you SHOULD stop
reading now if you don't know me.&lt;/p&gt;
&lt;p&gt;I seem a bit cynical, but in fact I'm pragmatic. Anyone who wants to
steer the world to his or her own direction should be busy doing
things to make that actually happen. Being vocal on the Internet
without actually doing anything is cheap.&lt;/p&gt;
&lt;p&gt;I don't judge people by what they say on the Internet, of
course. But sometimes when I read an opinion like "We should do X because
of Y", I do wonder if that person has ever done serious work in the
field. To avoid making a straw man argument, we'd better have a look
at a concrete example.&lt;/p&gt;
&lt;p&gt;Whenever a security bug related to memory safety shows up in system
software written in C, there are people who claim "we should expunge
all C code and rewrite everything in a memory safe language because C is
unsafe". Technically speaking I agree with them wholeheartedly.  I'm
not being sarcastic here -- I try to use the right language for the
right work in my day job, and I've learned more than a dozen languages
over the years. C is a terrible language; we do need better system level
language(s). But basing the argument for replacing C just on the language
itself misses the whole picture. C has excellent tooling support, it
has the right level of abstraction to work with a machine, and the list
goes on and on. Throwing away decades of effort is just
unrealistic. Guess what, people who care have already rolled up their
sleeves and started publishing code. Not sure how much of the code is
written by the most vocal people on the Internet, though. And I
believe that if the people who have done the real work were to make the
case, they would use a different argument than just "because C is
bad".&lt;/p&gt;
&lt;p&gt;What I have learned over the years is that people who care are too busy to express
opinions on the Internet; those who are very vocal tend to be doing
a disservice to the things they try to promote. All in all, the Internet
is just teeming with useless opinions.&lt;/p&gt;
&lt;p&gt;And I admit this piece is my humble contribution to the pool of
useless opinions on the Internet. Anyone who takes my words seriously
is just wasting his or her time.&lt;/p&gt;</summary></entry><entry><title>Write the Features That are Used Everyday</title><link href="http://blog.liuw.name/write-the-features-that-are-used-everyday.html" rel="alternate"></link><updated>2016-02-12T14:45:00+00:00</updated><author><name>Wei Liu</name></author><id>tag:blog.liuw.name,2016-02-12:write-the-features-that-are-used-everyday.html</id><summary type="html">&lt;p&gt;One of the things I learned during these years is that I should
prioritise my work based on the likelihood of that piece of work
getting used by users. The more likely users are to use it (directly or
indirectly), the more value and the less maintenance burden there is.&lt;/p&gt;
&lt;p&gt;The temptation to add things we think are useful without
actually validating the idea first is dangerous. Each line of code is
one line of liability. Code inevitably bit-rots when nobody
uses it. I've seen several examples myself, starting with the very
features I wrote.&lt;/p&gt;
&lt;p&gt;Having no direct input from project managers and end users, it's a bit
hard to imagine what would be useful and what would not. True, there are features
that everyone thinks are important, but many more are borderline cases
where we can't see an immediate return on investment.&lt;/p&gt;
&lt;p&gt;This is not to downplay features that have strategic importance even
when it is unclear whether users actually want them. Software needs to
evolve. Users would like to see shiny new things. And sometimes we
have to weigh this factor in.&lt;/p&gt;
&lt;p&gt;The experience of maintaining a piece of software has helped me become a
better developer in that regard.  When I have a new project idea, I no
longer rush to write the code. I spend more time weighing the
usefulness against the maintenance burden. I think long and hard about the
design. My goal, in the end, is to maximise the return value of my
work -- to write features that get used everyday.&lt;/p&gt;</summary></entry><entry><title>On Opinions</title><link href="http://blog.liuw.name/on-opinions.html" rel="alternate"></link><updated>2016-02-06T13:23:00+00:00</updated><author><name>Wei Liu</name></author><id>tag:blog.liuw.name,2016-02-06:on-opinions.html</id><summary type="html">&lt;p&gt;I am normally a laid-back person. I don't care about many things that
are not related to me. Heck, I don't even care about that many things
that are related to me. I say I have no opinion or no preference all
the time.&lt;/p&gt;
&lt;p&gt;By and large I'm still the same old me. I have grown somewhat
opinionated in certain aspects along the way, however. I don't know
what the tipping point was, what the last straw was that broke the
camel's back. Those aren't important in the grand scheme of
things. What is important is that I realise there is no such thing as
no opinion; there are only good opinions and bad opinions.&lt;/p&gt;
&lt;p&gt;More often than not, when I take on more responsibilities either in
life or at work, I have to form a way of thinking, which leads to
opinions on how things should work. This is not really a philosophy of
life. I refuse to use that phrase because the level of discussion
in this post doesn't qualify.&lt;/p&gt;
&lt;p&gt;The reason I have become somewhat opinionated lies in the fact that I
have had a lot more interactions with humans with conflicting
opinions in the past two years at work. Not that I shied away from
interaction with real humans before, but my line of work consisted mostly of
interactions with machines (which could be considered both a blessing and
a curse).&lt;/p&gt;
&lt;p&gt;Inevitably, when talking to different people with different opinions,
I form mine. I need to either agree or disagree with them. I can do
neither without my own opinion.&lt;/p&gt;
&lt;p&gt;This is both good and bad. It's good because I have gotten to the
point where I can have a consistent view of the big picture and actually
push things forward. It's bad because that blinds me, shutting down a
whole lot of new possibilities.&lt;/p&gt;
&lt;p&gt;And to be honest I become wary of being too entrenched in a position
as I become more and more opinionated. As said, there are only good and
bad opinions. Entrenching myself in a wrong position caused by a bad
opinion won't end well. But there is no way to tell accurately which
one is good and which one is bad. In principle I think everybody agrees
a well-thought opinion is better than one that comes up at random, so in
reality I try hard to form well-thought opinions. I think that is the
best one can do.&lt;/p&gt;</summary></entry><entry><title>Prepare Myself for "Stupid" Questions</title><link href="http://blog.liuw.name/prepare-myself-for-stupid-questions.html" rel="alternate"></link><updated>2016-01-30T23:40:00+00:00</updated><author><name>Wei Liu</name></author><id>tag:blog.liuw.name,2016-01-30:prepare-myself-for-stupid-questions.html</id><summary type="html">&lt;p&gt;We had a booth at one open source conference with a wide range of
open source software users. It is good to talk to end users from time
to time, because when you're isolated for too long from end users you
don't know how they perceive the project.&lt;/p&gt;
&lt;p&gt;We got all sorts of questions. I considered some high quality and some
"stupid". In the end, all questions need to be answered, otherwise we
look stupid ourselves.&lt;/p&gt;
&lt;p&gt;By "stupid questions", I mean the questions that are so wrong that
you don't even know where to start. But the user actually thinks the
question is legit. I think we normally handle high quality questions
very well because both sides know what they are doing. It's the
"stupid" questions that we handle badly from time to time.&lt;/p&gt;
&lt;p&gt;It's not a very good tactic to try to educate the user from the ground up
and try to rectify their understanding, because we don't have that
much time and the user is not here to receive a lecture. Another
aspect is that by rectifying all the errors you actually make the
other side feel stupid, while originally he or she might have been
feeling good and wishing to engage.&lt;/p&gt;
&lt;p&gt;The best way I can think of thus far is to use analogies. For example,
one user asks, "can product A generate an intermediate medium that can be
consumed by product B because the underlying technology is the same?"
The analogy would be "no, it can't, because that would be like requiring a BSD
binary to run on Linux. The underlying format is ELF on both platforms
but there are other things that just don't allow that." Use things
users can relate to so that they quickly grasp the idea.&lt;/p&gt;
&lt;p&gt;It does take some effort to improvise analogies on the spot,
but at least I have a way to deal with those questions now.&lt;/p&gt;</summary></entry><entry><title>My Stupid Mistake with Software Licensing</title><link href="http://blog.liuw.name/my-stupid-mistake-with-software-licensing.html" rel="alternate"></link><updated>2016-01-22T23:45:00+00:00</updated><author><name>Wei Liu</name></author><id>tag:blog.liuw.name,2016-01-22:my-stupid-mistake-with-software-licensing.html</id><summary type="html">&lt;p&gt;Never thought I would paint myself into a corner like this.&lt;/p&gt;
&lt;p&gt;I work on open source projects for a living. I like the idea that my
code is generally useful. I care how my code is being used to a
degree, but I'm definitely not a strong supporter of one particular
faction of the open source camp. Instead, I carefully avoid arguments
about different ideologies and / or definitions of open source /
free software. Those are difficult questions. Though they are very
important, I would rather spend my time somewhere else on more
productive stuff.&lt;/p&gt;
&lt;p&gt;My strategy has served me well. I sometimes read the license of the
projects I contribute to. But all in all, I don't care that much. I
believe most people are just like me: they want to do cool
stuff. What's inside the license file isn't at all critical to
making a contribution. And to clarify my position, software, open
source or not, is a means (however critical it is) to move
human civilization forward, not an end in itself.&lt;/p&gt;
&lt;p&gt;Things changed recently. I got involved with the BSD family more
often. That caused quite a bit of headache for me. Everyone knows a
good software engineer should reuse as much code as possible. So when
I try to contribute to BSD, I always think of importing some code I wrote
for Linux. But first and foremost, we need to get the license
straight. Unfortunately there is quite a bit of ambiguity in the
licensing of the code I want to pull in.&lt;/p&gt;
&lt;p&gt;So I discussed potential issues with team members, spent almost
half an hour with a colleague figuring out what the actual license of
some modules should be, and ventured to write several one-liner
patches to fix them so that they reflect reality.&lt;/p&gt;
&lt;p&gt;And that, of course, didn't end well. Everyone knew the current
situation was not ideal, but like every other discussion regarding
licenses, it quickly got derailed in several directions: a discussion of
whether one particular license exists or not, a suggestion that I should
leave it as-is, a suggestion that I should copy all rights holders (for the
record I didn't agree because I was merely fixing bugs).  I drafted
a reply or two, but in the end I deleted them because I just didn't
think that was going to be productive in any way.&lt;/p&gt;
&lt;p&gt;The cost of any license related patch is too high. Not that writing
a one-liner patch is particularly hard labor. It is the ensuing
endless discussion that makes it so tiresome. No wonder everybody
seems to avoid such topics as hard as they can.&lt;/p&gt;
&lt;p&gt;All in all, I think I made a stupid mistake in even daring to write such
patches. There are a lot more interesting and pressing problems to
be solved. That's where I should direct my energy.&lt;/p&gt;
&lt;p&gt;Happy hacking, not happy arguing.&lt;/p&gt;</summary></entry><entry><title>One Post per Week</title><link href="http://blog.liuw.name/one-post-per-week.html" rel="alternate"></link><updated>2016-01-16T22:28:00+00:00</updated><author><name>Wei Liu</name></author><id>tag:blog.liuw.name,2016-01-16:one-post-per-week.html</id><summary type="html">&lt;p&gt;I'm now setting a new resolution: I plan to write one blog post per
week.&lt;/p&gt;
&lt;p&gt;I was sitting at the desktop watching random videos on the Internet
this morning, then all of a sudden I felt hollow. I realized I had
wasted quite a lot of time just watching random videos on the
Internet. Life is too short to do things like that.&lt;/p&gt;
&lt;p&gt;As a matter of fact, I do have a lot of things that I want to write
about. I accumulate topics quite fast because I read articles and
books daily, so I don't think I will run out of ideas to rant about.&lt;/p&gt;
&lt;p&gt;I have accumulated several drafts in my home directory but never got
the motivation to finish them. They sort of sit there rotting away until I
eventually don't have the appetite to look at them anymore and delete
them. This is quite bad.&lt;/p&gt;
&lt;p&gt;The rule is simple: topic and length don't matter, just write one
post a week. Rhetoric can be bad, grammar errors are allowed -- heck,
I'm not aiming to become a professional writer or anything.&lt;/p&gt;
&lt;p&gt;Let's see how it goes.&lt;/p&gt;</summary></entry><entry><title>On Interaction with Open Source Communities</title><link href="http://blog.liuw.name/on-interaction-with-open-source-communities.html" rel="alternate"></link><updated>2015-11-26T23:23:00+00:00</updated><author><name>Wei Liu</name></author><id>tag:blog.liuw.name,2015-11-26:on-interaction-with-open-source-communities.html</id><summary type="html">&lt;p&gt;During our regular meeting my colleague said he learned a lot during a
trip to one of our vendors, much more than he had expected. One of the
many eye-opening lessons is that it is actually quite intimidating
to interact with open source communities.&lt;/p&gt;
&lt;p&gt;It is true that it is not an easy task to work in public. Sending
emails to a public mailing list is just like public speaking. Even if
I know whatever I say is not going to be used against me in any way, I
have the mental burden that if I make stupid mistakes it's going to be
on public record forever.&lt;/p&gt;
&lt;p&gt;I'm lucky to have had time to grow extra thick skin over a long period of
time. I started with small changes and gradually took on larger parts
of the project. I also learned how to not take
criticism personally and how to focus on the work itself.&lt;/p&gt;
&lt;p&gt;Others might not be so lucky. Imagine I'm hired by some vendor and
assigned to contribute a big feature as my first take on the
project. Surely there will be mistakes, and they will be picked up by
maintainers and reviewers. If the maintainers are terse and direct,
it's natural that I feel hostility towards me. The more interactions I
have, the more frustrated I feel. Eventually I just give up and move
on to other things.&lt;/p&gt;
&lt;p&gt;It's a pity that the openness of a project actually works against
it. No matter how nice one wants to be, there will be embarrassing
moments.&lt;/p&gt;
&lt;p&gt;There is no easy solution. The negative emotional effect is hard to
avoid. Some take it well, others don't. It takes time to educate
people. It takes time for people to change their mentality. I am not too
optimistic about this issue getting solved any time soon.&lt;/p&gt;</summary></entry><entry><title>Securing My Pet VMs</title><link href="http://blog.liuw.name/securing-my-pet-vms.html" rel="alternate"></link><updated>2015-11-23T23:49:00+00:00</updated><author><name>Wei Liu</name></author><id>tag:blog.liuw.name,2015-11-23:securing-my-pet-vms.html</id><summary type="html">&lt;p&gt;I have a number of (0 to 3, depending on conditions) pet VMs running
somewhere in the "cloud".&lt;/p&gt;
&lt;p&gt;I don't want to become a 24x7 system administrator myself. I have far
more important things to do in my day to day life. On the other hand,
I have valuable data out there in those VMs, and my secure communication
sometimes depends on them, so I do have an incentive to make them as
secure as possible. The Internet is a very dangerous place after all.&lt;/p&gt;
&lt;p&gt;Putting myself in the shoes of a malicious actor, I wouldn't bother
to penetrate a random server on the Internet with my own human time if
that target is not highly valuable. Attackers are likely to use
some well-known exploits and mass-scan to maximise gain over a set
period of time. With that in mind, most if not all attacks I'm facing
are from script-kiddies. Securing my VMs from attackers on the
Internet is simple -- I just need to follow every security
announcement channel of the software I use (from operating system to
applications) and at the same time apply security best practices. That
should save my arse in most situations.&lt;/p&gt;
&lt;p&gt;Another attack vector in the cloud era is the "cloud" itself. There will
always be security bugs in hypervisors. It's better to just assume the
underlying platform is insecure. This poses quite a challenge. One can
easily turn fully paranoid considering what the underlying
platform is able to do to his or her VMs. There isn't really much
I can do. But again, my basic assumption is that my data isn't too
valuable to attackers, so off-the-shelf encryption is good enough
for me. In fact, I do full disk encryption in my VMs most of the time,
so that no-one can peek into my disk image when it's offline. I also
stay away from the pre-baked images from the providers, so that I'm
immune to misconfiguration in their scripts or bugs in the images.&lt;/p&gt;
&lt;p&gt;I don't really consider running an IDS like Tripwire or AIDE. It would
be relatively easy to observe abnormal traffic to determine if my box is
compromised. An IDS doesn't provide much value against the threats &lt;em&gt;I&lt;/em&gt; face.
Furthermore, by the time an IDS or any other mechanism discovers that an intrusion has
happened, the highest priority is to migrate all data to a safe place. Figuring
out what bugs led to the intrusion is irrelevant in that context. It would be
nice though if there were a tool integrated with the Linux distribution.
FreeBSD has `freebsd-update IDS', which is convenient and useful to a degree. I
haven't found a similar utility in the Linux distribution I use and it's not
likely to appear in the future, because fundamentally a Linux distribution is not
developed as a single entity but built from a bunch of loosely coupled software.&lt;/p&gt;
&lt;p&gt;End of brain dump when I'm building a storage VM.&lt;/p&gt;</summary></entry><entry><title>Free Software is not Free</title><link href="http://blog.liuw.name/free-software-is-not-free.html" rel="alternate"></link><updated>2015-11-16T22:30:00+00:00</updated><author><name>Wei Liu</name></author><id>tag:blog.liuw.name,2015-11-16:free-software-is-not-free.html</id><summary type="html">&lt;p&gt;There has been some back and forth discussion on how Xen's security process
sucks. Well, not quite. Though the email was titled that way, the content
didn't have much to do with security process per se. True, there are
technically correct points in that thread (with which I completely agree), but
the underlying theme is disturbing. That thread prompted me to think about how
open source projects function. What I'm going to dwell on is not specific to
the Xen project.&lt;/p&gt;
&lt;p&gt;All in all, there are people that don't want to invest time and money into an
open source project but want undue influence. The cold hard fact is that things
just don't work that way. Making a piece of software requires a lot of work.
Whatever great idea is floating around needs to be implemented by an actual person
-- that comes down to time and money.&lt;/p&gt;
&lt;p&gt;So while open source projects are often branded as "free" (as in either free
speech or free beer), the work behind them is not.  The open source world nowadays
functions differently than it did 20 years ago. It has developed into a business model
in which every major project has the vested interests of some companies behind it. Their
employees act in the interest of the employers and prioritise work items
accordingly. There are hobbyists working on a project but it's hard for them to
make substantive contributions.&lt;/p&gt;
&lt;p&gt;People have the expectation that, since an open source project is "free", they are
entitled to add more work items even if they don't want to invest time and
money. This is fundamentally wrong. Well, they are "free" to voice their
opinions, but then developers are "free" to ignore part of or all of those
opinions. The door is always open for collaborations though -- everybody is
"free" to join.&lt;/p&gt;
&lt;p&gt;In a proprietary software world, there is normally no public channel to ask for
changes. But I'm pretty sure it all comes down to support contract or whatnot.
I can't imagine someone going to Microsoft out of the blue and saying "the way you
work is wrong, your priority is wrong, let me tell you what to do, but I don't
want to pay you anything".&lt;/p&gt;
&lt;p&gt;And I have to admit I sometimes feel upset by such attitude and entitlement.
The joy of working on an open source project is ruined. Not respecting the effort
other people put into the project is just demotivating. And I'm not only
talking about myself or the Xen project specifically. There have been several
instances of developer burnout. Many people also wrote essays on similar
subjects.&lt;/p&gt;
&lt;p&gt;In my humble opinion, free software projects are just not as "free" as some people
would like them to be. Ultimately, a free software project is what the whole
community makes it.  If one disagrees with the direction of a project, he or she
should join and lead the change for the better.&lt;/p&gt;</summary></entry><entry><title>Of Course Joanna is Right</title><link href="http://blog.liuw.name/of-course-joanna-is-right.html" rel="alternate"></link><updated>2015-11-01T14:50:00+00:00</updated><author><name>Wei Liu</name></author><id>tag:blog.liuw.name,2015-11-01:of-course-joanna-is-right.html</id><summary type="html">&lt;p&gt;There has been quite a lot of Xen-bashing over XSA-148. It's no doubt a cluster-fuck, a reflection of the sad state of the art of computer system security.&lt;/p&gt;
&lt;p&gt;As Ian pointed out in his blog post, we collectively choose features over security. Security only matters when there is a big fuck-up like HeartBleed, ShellShock or XSA-148.&lt;/p&gt;
&lt;p&gt;What is even sadder is the attitude of the media in general. Journalists actively hunt for headline-worthy bugs. With the help of slightly technically incorrect writing that skews the true situation, and some other insinuations along the way, they have produced many masterpieces lamenting the dismal state of computer system security. Not that they're wrong about the conclusion; in my opinion they just completely miss the important points. Remember VENOM? I didn't mention it in my list of big fuck-ups because it really wasn't anything serious except for its cool name. But the media picked up on it and went on a spree nonetheless, just because the cool name fit right into their headlines.&lt;/p&gt;
&lt;p&gt;So what's wrong with the world? In short, we really really really don't care about security. The effort of a small group of people pales in front of a world that cares more about cool new shiny things.&lt;/p&gt;
&lt;p&gt;In my world view, this is an ever-changing world. Software is under constant pressure to evolve and adapt to its external environment. No software is perfect. There were, are and will be major bugs in all seriously written software.&lt;/p&gt;
&lt;p&gt;If one thinks a piece of software is secure just because there is no public list of security advisories, then he or she is delusional. Good luck with building things on top of that piece of software without investing a significant amount of money in security audits. And if the media picks on a piece of software because there is such a list, they are actually doing a disservice to the wider community. That drives projects to sweep problems under the carpet.&lt;/p&gt;
&lt;p&gt;Coming back to XSA-148, there is no excuse on the Xen project's part. That's a serious bug, period. What should be done next is to use the correct process to avoid such errors in the future and, in similar situations, to minimise the impact. The Xen community constantly works on procedural improvements, learning from the past to make the future better. A constructive way of moving forward is for the Xen community to engage with security researchers to improve the security of Xen. I look forward to that.&lt;/p&gt;</summary></entry><entry><title>On Toxicity on LKML</title><link href="http://blog.liuw.name/on-toxicity-on-lkml.html" rel="alternate"></link><updated>2015-10-18T23:50:00+01:00</updated><author><name>Wei Liu</name></author><id>tag:blog.liuw.name,2015-10-18:on-toxicity-on-lkml.html</id><summary type="html">&lt;p&gt;Sarah Sharp quit, publicly.&lt;/p&gt;
&lt;p&gt;Just to be clear: I don't know her personally. The closest link she and I ever had was that I participated in Outreachy (formerly known as Outreach Program for Women) as a mentor, where she had been one of the coordinators for some time.&lt;/p&gt;
&lt;p&gt;I do, however, have a lot of sympathy for her. And I would say I agree, to some degree, with what she said in her blog post. The toxicity on LKML is appalling. Luckily I wasn’t on the receiving end of any of it, but even as a bystander I felt it was too much every time I saw crude rudeness. In short, LKML never ceased to amaze me with its level of rudeness.&lt;/p&gt;
&lt;p&gt;And it becomes increasingly clear to me that many LKML users (not necessarily Linux kernel developers) see rudeness as a necessity to ensure code quality. If you don’t believe me, just look at the comment section under Sarah's story on LWN.net.&lt;/p&gt;
&lt;p&gt;I firmly believe that an open source community should be blunt about technical issues but respectful to people. I use that as my guideline for dealing with people in my day to day work. As someone who has been making a living by contributing to a number of open source projects, I started to question whether it is worth putting more effort into Linux kernel development. But then I realised I’ve already been retreating from Linux kernel development for a long time. The decision was made subconsciously without me even noticing it. Of course I still work on it when I’m paid to do so, but other than that, I don’t think I will spend my spare time on it anymore.&lt;/p&gt;</summary></entry><entry><title>Anatomy of Xen Alternative Infrastructure</title><link href="http://blog.liuw.name/anatomy-of-xen-alternative-infrastructure.html" rel="alternate"></link><updated>2014-08-29T15:03:00+01:00</updated><author><name>Wei Liu</name></author><id>tag:blog.liuw.name,2014-08-29:anatomy-of-xen-alternative-infrastructure.html</id><summary type="html">&lt;p&gt;Frankly speaking I think the name "alternative" is a bit unclear to
outsiders. It's used to patch the raw machine code of the kernel at
runtime. Why is it useful? It gives you a chance to selectively patch
machine code according to CPU features and vendors.&lt;/p&gt;
&lt;p&gt;Xen borrows a stripped down version of the alternative infrastructure from
the Linux kernel. It's more concise and easier to understand, because Xen
applies alternative instructions before SMP initialisation and it
doesn't support altering instructions after everything else is up and
running. The implementation in Linux is more complex as it has more
functionality. The core principle remains the same in Xen, however.&lt;/p&gt;
&lt;p&gt;This infrastructure is only used on the x86 architecture at the moment so
files are placed under x86 folders. There are only two files,
xen/arch/x86/alternative.c and xen/include/asm-x86/alternative.h.&lt;/p&gt;
&lt;p&gt;Let's look at the header file first.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;struct alt_instr {
    s32 instr_offset;       /* original instruction */
    s32 repl_offset;        /* offset to replacement instruction */
    u16 cpuid;              /* cpuid bit set for replacement */
    u8  instrlen;           /* length of original instruction */
    u8  replacementlen;     /* length of new instruction, &amp;lt;= instrlen */
};

#define OLDINSTR(oldinstr)      &amp;quot;661:\n\t&amp;quot; oldinstr &amp;quot;\n662:\n&amp;quot;

#define b_replacement(number)   &amp;quot;663&amp;quot;#number
#define e_replacement(number)   &amp;quot;664&amp;quot;#number

#define alt_slen &amp;quot;662b-661b&amp;quot;
#define alt_rlen(number) e_replacement(number)&amp;quot;f-&amp;quot;b_replacement(number)&amp;quot;f&amp;quot;

#define ALTINSTR_ENTRY(feature, number)                                       \
        &amp;quot; .long 661b - .\n&amp;quot;                             /* label           */ \
        &amp;quot; .long &amp;quot; b_replacement(number)&amp;quot;f - .\n&amp;quot;        /* new instruction */ \
        &amp;quot; .word &amp;quot; __stringify(feature) &amp;quot;\n&amp;quot;             /* feature bit     */ \
        &amp;quot; .byte &amp;quot; alt_slen &amp;quot;\n&amp;quot;                         /* source len      */ \
        &amp;quot; .byte &amp;quot; alt_rlen(number) &amp;quot;\n&amp;quot;                 /* replacement len */

#define DISCARD_ENTRY(number)                           /* rlen &amp;lt;= slen */    \
        &amp;quot; .byte 0xff + (&amp;quot; alt_rlen(number) &amp;quot;) - (&amp;quot; alt_slen &amp;quot;)\n&amp;quot;

#define ALTINSTR_REPLACEMENT(newinstr, feature, number) /* replacement */     \
        b_replacement(number)&amp;quot;:\n\t&amp;quot; newinstr &amp;quot;\n&amp;quot; e_replacement(number) &amp;quot;:\n\t&amp;quot;

/* alternative assembly primitive: */
#define ALTERNATIVE(oldinstr, newinstr, feature)                        \
        OLDINSTR(oldinstr)                                              \
        &amp;quot;.pushsection .altinstructions,\&amp;quot;a\&amp;quot;\n&amp;quot;                         \
        ALTINSTR_ENTRY(feature, 1)                                      \
        &amp;quot;.popsection\n&amp;quot;                                                 \
        &amp;quot;.pushsection .discard,\&amp;quot;aw\&amp;quot;,@progbits\n&amp;quot;                      \
        DISCARD_ENTRY(1)                                                \
        &amp;quot;.popsection\n&amp;quot;                                                 \
        &amp;quot;.pushsection .altinstr_replacement, \&amp;quot;ax\&amp;quot;\n&amp;quot;                  \
        ALTINSTR_REPLACEMENT(newinstr, feature, 1)                      \
        &amp;quot;.popsection&amp;quot;
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;ALTINSTR_ENTRY is the equivalent of struct alt_instr in assembly.&lt;/p&gt;
&lt;p&gt;For example, stac and clac are defined using the alternative mechanism.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;static always_inline void clac(void)
{
    /* Note: a barrier is implicit in alternative() */
    alternative(ASM_NOP3, __stringify(__ASM_CLAC), X86_FEATURE_SMAP);
}
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;&lt;em&gt;alternative&lt;/em&gt; is a wrapper around ALTERNATIVE. So in effect this inline
function first emits a 3-byte NOP, because the machine code of clac is 3 bytes
long. Then an alternative instruction entry is created in the
.altinstructions section. A discard entry is created in the .discard
section. Finally the alternative instructions used to replace the
original ones are stored in the .altinstr_replacement section.&lt;/p&gt;
&lt;p&gt;Expanding the clac snippet into assembly code gives:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;asm volatile (ALTERNATIVE(ASM_NOP3, __stringify(__ASM_CLAC), X86_FEATURE_SMAP) : : : &amp;quot;memory&amp;quot;);

asm volatile (
    &amp;quot;661:\n\t&amp;quot; ASM_NOP3 &amp;quot;\n662:\n&amp;quot;

    &amp;quot;.pushsection .altinstructions,\&amp;quot;a\&amp;quot;\n&amp;quot;
    &amp;quot; .long 661b - .\n&amp;quot;                           /* label           */
    &amp;quot; .long 6631f - .\n&amp;quot;                          /* new instruction */
    &amp;quot; .word &amp;quot; __stringify(X86_FEATURE_SMAP) &amp;quot;\n&amp;quot;  /* feature bit     */
    &amp;quot; .byte 662b - 661b\n&amp;quot;                        /* source len      */
    &amp;quot; .byte 6641f - 6631f\n&amp;quot;                      /* replacement len */
    &amp;quot;.popsection\n&amp;quot;

    &amp;quot;.pushsection .discard,\&amp;quot;aw\&amp;quot;,@progbits\n&amp;quot;
    &amp;quot; .byte 0xff + (6641f - 6631f) - (662b - 661b)\n&amp;quot;
    &amp;quot;.popsection\n&amp;quot;

    &amp;quot;.pushsection .altinstr_replacement, \&amp;quot;ax\&amp;quot;\n&amp;quot;
    &amp;quot;6631:\n\t&amp;quot; __stringify(__ASM_CLAC) &amp;quot;\n6641:\n\t&amp;quot;
    &amp;quot;.popsection&amp;quot;

: : : &amp;quot;memory&amp;quot;);
&lt;/pre&gt;&lt;/div&gt;
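&lt;p&gt;A note on the .discard entry in the expansion above. My reading of the "rlen &amp;lt;= slen" comment is that the .byte expression only fits into one byte when the replacement is no longer than the original, so an over-long replacement should make the assembler complain at build time. Here is that arithmetic sketched in Python rather than assembler; the helper names discard_byte and fits_in_byte are made up for illustration, and the exact assembler diagnostic is an assumption on my part.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;def discard_byte(slen, rlen):
    '''Value of the expression 0xff + rlen - slen used by DISCARD_ENTRY.'''
    return 0xff + rlen - slen

def fits_in_byte(value):
    # A .byte directive can only hold 0x00..0xff.
    return value in range(0x100)

# clac: 3-byte original NOP, 3-byte replacement -- accepted.
print(fits_in_byte(discard_byte(3, 3)))   # True
# A 5-byte replacement for a 3-byte original overflows the byte.
print(fits_in_byte(discard_byte(3, 5)))   # False
&lt;/pre&gt;&lt;/div&gt;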


&lt;p&gt;When Xen boots up, it iterates through the .altinstructions section,
picks up each entry and patches the call site if the required feature bit
is set. See alternative.c:apply_alternatives. If the required feature
is not available, those instructions remain NOPs.&lt;/p&gt;
&lt;p&gt;If you inspect the object file that contains functions that are
implemented with the alternative mechanism (for example, usercopy.c calls
stac and clac), you can see:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;objdump -h xen/arch/x86/usercopy.o
xen/arch/x86/usercopy.o:     file format elf64-x86-64

Sections:
Idx Name          Size      VMA               LMA               File off  Algn
 &lt;span class="m"&gt;0&lt;/span&gt; .text         000001e9  &lt;span class="m"&gt;0000000000000000&lt;/span&gt;  &lt;span class="m"&gt;0000000000000000&lt;/span&gt;  &lt;span class="m"&gt;00000040&lt;/span&gt;  2**2
                 CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
 &lt;span class="m"&gt;1&lt;/span&gt; .data         &lt;span class="m"&gt;00000000&lt;/span&gt;  &lt;span class="m"&gt;0000000000000000&lt;/span&gt;  &lt;span class="m"&gt;0000000000000000&lt;/span&gt;  0000022c  2**2
                 CONTENTS, ALLOC, LOAD, DATA
 &lt;span class="m"&gt;2&lt;/span&gt; .bss          &lt;span class="m"&gt;00000000&lt;/span&gt;  &lt;span class="m"&gt;0000000000000000&lt;/span&gt;  &lt;span class="m"&gt;0000000000000000&lt;/span&gt;  0000022c  2**2
                 ALLOC
 &lt;span class="m"&gt;3&lt;/span&gt; .altinstructions &lt;span class="m"&gt;00000048&lt;/span&gt;  &lt;span class="m"&gt;0000000000000000&lt;/span&gt;  &lt;span class="m"&gt;0000000000000000&lt;/span&gt;  0000022c  2**0
                 CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
 &lt;span class="m"&gt;4&lt;/span&gt; .discard      &lt;span class="m"&gt;00000006&lt;/span&gt;  &lt;span class="m"&gt;0000000000000000&lt;/span&gt;  &lt;span class="m"&gt;0000000000000000&lt;/span&gt;  &lt;span class="m"&gt;00000274&lt;/span&gt;  2**0
                 CONTENTS, ALLOC, LOAD, DATA
 &lt;span class="m"&gt;5&lt;/span&gt; .altinstr_replacement &lt;span class="m"&gt;00000012&lt;/span&gt;  &lt;span class="m"&gt;0000000000000000&lt;/span&gt;  &lt;span class="m"&gt;0000000000000000&lt;/span&gt;  0000027a  2**0
                 CONTENTS, ALLOC, LOAD, READONLY, CODE
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Disassembling the .altinstr_replacement section yields:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;objdump -d -j .altinstr_replacement xen/arch/x86/usercopy.o

xen/arch/x86/usercopy.o:     file format elf64-x86-64

Disassembly of section .altinstr_replacement:

&lt;span class="m"&gt;0000000000000000&lt;/span&gt; &amp;lt;.altinstr_replacement&amp;gt;:
   0:   0f &lt;span class="m"&gt;01&lt;/span&gt;                   &lt;span class="o"&gt;(&lt;/span&gt;bad&lt;span class="o"&gt;)&lt;/span&gt;
   2:   cb                      lret
   3:   0f &lt;span class="m"&gt;01&lt;/span&gt;                   &lt;span class="o"&gt;(&lt;/span&gt;bad&lt;span class="o"&gt;)&lt;/span&gt;
   5:   ca 0f &lt;span class="m"&gt;01&lt;/span&gt;                lret   &lt;span class="nv"&gt;$0x10f&lt;/span&gt;
   8:   cb                      lret
   9:   0f &lt;span class="m"&gt;01&lt;/span&gt;                   &lt;span class="o"&gt;(&lt;/span&gt;bad&lt;span class="o"&gt;)&lt;/span&gt;
   b:   ca 0f &lt;span class="m"&gt;01&lt;/span&gt;                lret   &lt;span class="nv"&gt;$0x10f&lt;/span&gt;
   e:   cb                      lret
   f:   0f &lt;span class="m"&gt;01&lt;/span&gt;                   &lt;span class="o"&gt;(&lt;/span&gt;bad&lt;span class="o"&gt;)&lt;/span&gt;
  11:   ca                      .byte 0xca
&lt;/pre&gt;&lt;/div&gt;


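&lt;p&gt;0f 01 ca and 0f 01 cb are the machine code for clac and stac respectively. The objdump used here evidently doesn't recognise them, hence the (bad) and lret lines.&lt;/p&gt;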
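&lt;p&gt;As a sanity check, the section sizes in the objdump -h output line up with the struct layout. A small Python sketch, assuming struct alt_instr is laid out with no padding (two s32, one u16, two u8) and that every alternative in usercopy.o is a 3-byte stac or clac:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;import struct

# Layout of struct alt_instr: s32, s32, u16, u8, u8 ('=' means no padding).
ALT_INSTR = struct.Struct('=iiHBB')

altinstructions_size = 0x48   # sizes taken from the objdump -h output
discard_size = 0x06
replacement_size = 0x12

entries = altinstructions_size // ALT_INSTR.size
print(ALT_INSTR.size)                     # 12
print(entries)                            # 6
print(discard_size == entries)            # True: one discard byte per entry
print(replacement_size == entries * 3)    # True: each stac/clac is 3 bytes
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Six entries also matches the six 0f 01 cx sequences in the disassembly.&lt;/p&gt;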
&lt;p&gt;Use gdb to look at the call site of stac:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;gdb xen/arch/x86/usercopy.o
&lt;span class="o"&gt;(&lt;/span&gt;gdb&lt;span class="o"&gt;)&lt;/span&gt; disas /r __copy_from_user_ll
Dump of assembler code &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; __copy_from_user_ll:
   0x000000000000003c &amp;lt;+0&amp;gt;: &lt;span class="m"&gt;55&lt;/span&gt;  push   %rbp
   0x000000000000003d &amp;lt;+1&amp;gt;: &lt;span class="m"&gt;48&lt;/span&gt; &lt;span class="m"&gt;89&lt;/span&gt; e5    mov    %rsp,%rbp
   0x0000000000000040 &amp;lt;+4&amp;gt;: &lt;span class="m"&gt;89&lt;/span&gt; d1   mov    %edx,%ecx
   0x0000000000000042 &amp;lt;+6&amp;gt;: &lt;span class="m"&gt;66&lt;/span&gt; &lt;span class="m"&gt;66&lt;/span&gt; &lt;span class="m"&gt;90&lt;/span&gt;    data32 xchg %ax,%ax
   0x0000000000000045 &amp;lt;+9&amp;gt;: &lt;span class="m"&gt;48&lt;/span&gt; &lt;span class="m"&gt;89&lt;/span&gt; c8    mov    %rcx,%rax
   0x0000000000000048 &amp;lt;+12&amp;gt;:    &lt;span class="m"&gt;48&lt;/span&gt; &lt;span class="m"&gt;83&lt;/span&gt; f9 0f cmp    &lt;span class="nv"&gt;$0xf&lt;/span&gt;,%rcx
&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The line 66 66 90 is in fact ASM_NOP3 if you look at its definition.&lt;/p&gt;
&lt;p&gt;If the requested feature is not present, ASM_NOP3 remains
untouched. Otherwise it's replaced with 0f 01 cb.&lt;/p&gt;
&lt;p&gt;The patching procedure can be seen in alternative.c. It's quite
straightforward -- just plain memcpy.&lt;/p&gt;
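&lt;p&gt;To make the patching step concrete, here is a toy model in Python. It is not the Xen code: this apply_alternatives is a simplified stand-in that works on a bytearray, the feature number 1 is a placeholder rather than the real X86_FEATURE_SMAP value, and padding a shorter replacement with one-byte NOPs is my assumption about how leftover bytes would be handled.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;NOP3 = bytes([0x66, 0x66, 0x90])   # ASM_NOP3 as seen in the gdb dump
STAC = bytes([0x0f, 0x01, 0xcb])
FEATURE_SMAP = 1                    # placeholder feature number

def apply_alternatives(text, entries, cpu_features):
    '''Copy each replacement over its call site if the CPU has the feature.'''
    for offset, instrlen, feature, replacement in entries:
        if feature not in cpu_features:
            continue                              # leave the original NOP
        assert len(replacement) in range(instrlen + 1)
        padded = replacement + b'\x90' * (instrlen - len(replacement))
        text[offset:offset + instrlen] = padded   # the memcpy
    return text

# push %rbp; mov %rsp,%rbp; mov %edx,%ecx; NOP3 (call site at offset 6)
code = bytes([0x55, 0x48, 0x89, 0xe5, 0x89, 0xd1]) + NOP3
entries = [(6, 3, FEATURE_SMAP, STAC)]

print(apply_alternatives(bytearray(code), entries, set()).hex())
# 554889e589d1666690
print(apply_alternatives(bytearray(code), entries, {FEATURE_SMAP}).hex())
# 554889e589d10f01cb
&lt;/pre&gt;&lt;/div&gt;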
&lt;p&gt;This is it. This post mainly targets beginners who are interested in
tricks in low level programming. It does require a certain level of
understanding of the tools though. Fortunately the manuals of those
tools are excellent so I won't go into details on how to use those
tools.&lt;/p&gt;</summary></entry><entry><title>Clock is monoid</title><link href="http://blog.liuw.name/clock-is-monoid.html" rel="alternate"></link><updated>2014-08-12T23:10:00+01:00</updated><author><name>Wei Liu</name></author><id>tag:blog.liuw.name,2014-08-12:clock-is-monoid.html</id><summary type="html">&lt;p&gt;I bumped into "functional programming" several times during my years in college
and now I'm planning to spend serious effort in learning it. My language of
choice, after evaluating several, is Haskell.&lt;/p&gt;
&lt;p&gt;But this post is not about Haskell. I mentioned Haskell because I encountered
many mathematical concepts when learning, such as functor, monoid and monad.
They are a bit far-fetched for me because I have a background of low level
programming.&lt;/p&gt;
&lt;p&gt;Recently I came across a video on YouTube, in which monoid is not depicted as
purely mathematical but as something we see every day. The clock we see everywhere is
actually an instance of monoid.&lt;/p&gt;
&lt;p&gt;For a set S with some binary operation 'dot' which maps S x S -&amp;gt; S to be a monoid, it
has to satisfy two axioms:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;For all a, b and c in S, (a 'dot' b) 'dot' c = a 'dot' (b 'dot' c)&lt;/li&gt;
&lt;li&gt;There exists an identity element u in S such that for every element a in S, a 'dot' u = u
  'dot' a = a&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;For a clock, the identity element is 12 and 'dot' can be defined as
'dot' a b = (a + b) % 12, with a result of 0 read as 12.&lt;/p&gt;
&lt;p&gt;Pretty neat, isn't it?&lt;/p&gt;
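&lt;p&gt;The two axioms can be checked by brute force for the clock. A small Python sketch (mine, not from the video), assuming the clock face holds 1 to 12 and that a remainder of 0 is read as 12 so results stay on the face:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;from itertools import product

HOURS = range(1, 13)   # the clock face: 1..12
IDENTITY = 12

def dot(a, b):
    '''Add two clock readings; a remainder of 0 is read as 12.'''
    return (a + b) % 12 or 12

def is_monoid():
    closed = all(dot(a, b) in HOURS for a, b in product(HOURS, repeat=2))
    associative = all(dot(dot(a, b), c) == dot(a, dot(b, c))
                      for a, b, c in product(HOURS, repeat=3))
    identity = all(dot(a, IDENTITY) == a and dot(IDENTITY, a) == a
                   for a in HOURS)
    return closed and associative and identity

print(dot(9, 5))     # 2: five hours after 9 o'clock
print(is_monoid())   # True
&lt;/pre&gt;&lt;/div&gt;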
&lt;p&gt;There's a lot of other information in that video. The discussion on
composability is quite delightful as well.&lt;/p&gt;
&lt;p&gt;And here is the link.&lt;/p&gt;
&lt;p&gt;https://www.youtube.com/watch?v=ZhuHCtR3xq8&lt;/p&gt;</summary></entry><entry><title>The CAPTCHA Economy</title><link href="http://blog.liuw.name/the-captcha-economy.html" rel="alternate"></link><updated>2014-05-17T10:50:00+01:00</updated><author><name>Wei Liu</name></author><id>tag:blog.liuw.name,2014-05-17:the-captcha-economy.html</id><summary type="html">&lt;p&gt;During the daily lunch time chat in the office kitchen yesterday, my friend told me about a service he has been using for a while. That prompted me to look further into it, and I discovered some interesting facts. Though not all of my questions about that service have been answered (either by him or my own investigation), I would still like to write down my thoughts so far.&lt;/p&gt;
&lt;p&gt;The service in question uses human labour to recognise CAPTCHA images. As a client, you go to a website, register, top up some credit, then invoke their API to send images and get back results. Those CAPTCHA images are distributed to workers around the globe, who return the text. There is a bidding system in place. If request volume is high then the price goes up, otherwise it goes down.&lt;/p&gt;
&lt;p&gt;My friend told me about one particular site he's using, but he also mentioned that there are tons of similar sites on the Internet. The service is charged on a per-request basis. The average cost for hiring someone behind a computer to recognise one image is about 1.5 cents. My friend told me that he paid $10 one year ago and he still had $7 credit when we had the conversation.&lt;/p&gt;
&lt;p&gt;I think of this kind of service as a clever hack that leverages machine power (to distribute and collect) and human power (to recognise patterns) to achieve a certain goal. However, what's astonishing is that the average cost is so low that I don't even understand how this system manages to sustain itself. I mean, do those workers really get paid enough to feed themselves?&lt;/p&gt;
&lt;p&gt;I actually opened the site my friend is using when I got home. What's handy is that it shows various stats for its service. I can see that, with the bidding system in place, the price for 1000 requests ranges from $0.75 to $1.5. Most of the time the price stays around $1. That is, in my opinion, still too low. However I'm certainly wrong because this system seems to work fine. The site states it has been in business since 2007. Further down the site there is a pretty pie chart showing the distribution of workers worldwide. High on the list are Pakistan, India, Indonesia, Vietnam etc, which account for more than 70% of all workers. China is also on the list with ~3%. Interestingly the USA is also on the pie chart, with more or less the same percentage as China. Russian and Ukrainian workers are also non-negligible. The number of workers tops out at 1000.&lt;/p&gt;
&lt;p&gt;My friend and I did some back-of-the-envelope calculation on the revenue of this service. Unfortunately I didn't take notes so I'm doing it again here, from scratch. According to the material on the website, the average time to recognise an image is 15 seconds. But we decided to make it 10 seconds, given our own experience dealing with CAPTCHA images. So an average person like my friend or me can probably finish 6 requests per minute. That makes 360 requests per hour. Multiply that by 1000 (the maximum number of workers seen on the website from the previous day's data), then divide by 1000 (the average price for 1000 requests was $1). That yields the maximum possible hourly revenue of the service, that is, $360/h. We had no idea what percentage the service provider takes. If it is 10%, that makes it $36/h. I think that's probably enough to cover the cost of maintenance, bandwidth etc, and even enough to make a small profit.&lt;/p&gt;
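&lt;p&gt;The same back-of-the-envelope estimate, redone in Python so the steps are easy to check. The 10% cut is the same guess as in the text, not a figure from the site, and the variable names are mine:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;from fractions import Fraction

seconds_per_image = 10
workers = 1000
price_per_1000_usd = 1
provider_cut = Fraction(10, 100)        # guessed, as in the text

per_worker_per_hour = 3600 // seconds_per_image          # 360
requests_per_hour = per_worker_per_hour * workers        # 360000
revenue_per_hour = Fraction(requests_per_hour, 1000) * price_per_1000_usd
provider_per_hour = revenue_per_hour * provider_cut

print(per_worker_per_hour)      # 360
print(revenue_per_hour)         # 360
print(provider_per_hour)        # 36
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;So a 10% cut of $360/h comes to $36/h for the provider.&lt;/p&gt;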
&lt;p&gt;It might be profitable from the service provider's point of view, but I still cannot see how feasible it is for a worker to make a living out of it. I also told another friend about this service and had a short discussion. Her idea is that the workers are probably not treating it as a full time job. But my point still stands: it's not profitable for a worker because you need a computer and an Internet connection to do that job. The payback is not even likely to cover the expense, let alone make a profit / living out of it.&lt;/p&gt;
&lt;p&gt;I could be wrong, so I decided to go further. There's a line of text in a much smaller font saying "if you want to do data entry job, click here". I clicked that link and another website popped up. A remarkable line on the new website showed that the price can be as low as $0.35 per 1000 entries. I couldn't help asking myself why anyone would want to do this. Nonetheless I registered a worker account (with a newly registered email of course) and logged in.&lt;/p&gt;
&lt;p&gt;The interface is quite simple. The first page is "workspace", where you can look at the image you receive and enter text. For every image you receive there is a countdown timer at the bottom; if you don't type at all, the timer runs down. Presumably if the timer reaches zero the image is dispatched to another worker. But if you type anything, the timer is "refilled".&lt;/p&gt;
&lt;p&gt;I entered a few entries. Suddenly my work item was revoked and the website told me that I had entered one thing wrong -- well, I didn't pay much attention for sure. I was told that if I make 5 mistakes in a month my account will be banned. That's quite scary, given the amount of work you are required to do to make so little (1000 entries for $1) and the number of mistakes you're allowed to make (5 per month). Apparently the dispatch queue was not so full because I sat idle once or twice.&lt;/p&gt;
&lt;p&gt;I soon lost interest in this game and turned to look at the navigation bar. I saw several tabs, among which was "payment". There's also a "score board"-ish tab. I also gave that account to my friend. She kind of found it interesting and looked at it as a game. However, I doubt that anyone who wants to make money from it enjoys it as much as my friend did. When I got that account back, the system had refreshed my payment. I could see that my friend and I entered 26 entries and made $0.013. Well, that's only $0.0005 per entry!&lt;/p&gt;
&lt;p&gt;I had a look at the "score board". The top worker made ~$180 last month, while number 100 made ~$42.&lt;/p&gt;
&lt;p&gt;After playing with the system and seeing the "score board", I had the impression that this is not an appealing job for me. I mean, even if someone is in desperate need of money, he or she should not consider doing this, because I don't see any possible way to make real money out of it.&lt;/p&gt;
&lt;p&gt;Admittedly I don't know enough about other Asian countries; I don't really have an idea what their standard of living is like. So probably for those countries high on the list, their workers can actually make a profit from this job. As for the USA workers, I have a theory that they might be homeless people with a laptop and access to free electricity and free wifi -- I've seen reports on this lifestyle; they get a free laptop from a recycling station / donation and free electricity / wifi from McDonald's. To be honest, the standard of living of the homeless in the USA is quite high compared to other countries, so they can probably use this system to earn themselves a cup of coffee when they have nothing else to do. But I don't think those USA workers rely on this system to make a living by any means. For the developing countries, the device required for this job, the Internet connection and the electricity are not free (not saying that there's no free ride at all, just not as easy to get as in the US).&lt;/p&gt;
&lt;p&gt;With my Chinese hat on, I can say something more specific. Say I'm a guy from a very small town (by Chinese standards of course) and want to go into this job. I will first need a computer, which needs to be at the very least able to run a graphical desktop environment and a web browser. That will probably cost me 500 Yuan (~$80) if I buy a very very very old second hand computer. Then the Internet connection will probably cost me 30 Yuan (~$5) per month. Assume my computer draws 200W and runs 8 hours per day, at a unit price of 0.6 Yuan (~$0.1) per kWh; that makes the electricity bill ~1 Yuan (~$0.16) per day. To cover the electricity bill alone, I will need to enter 160 entries per day. To cover the Internet fee I will need to enter 5000 entries per month. So, just to cover regular expenses, I will need to enter 5000 + 160 * 30 = 9800 entries per month (30 is the number of working days, and yes, I consider myself a workaholic). To get back my investment in the computer I will need to enter another 80000 entries. After all that, I get a job with an hourly rate of 2.27 Yuan ($0.36), which can barely feed me given the current cost of living in my hometown. But of course, if I really have nothing else to do, this job is still something to keep myself occupied -- any income is better than no income.&lt;/p&gt;
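&lt;p&gt;The same arithmetic in Python, for anyone who wants to plug in their own numbers. It assumes the $1 per 1000 entries price, electricity billed at 0.6 Yuan per kWh, and the 360 entries per hour from the earlier estimate; the exchange rate of 6.3 Yuan per dollar is my assumption, chosen so that $0.36 comes out at about 2.27 Yuan.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;from fractions import Fraction

usd_per_entry = Fraction(1, 1000)
yuan_per_usd = Fraction('6.3')          # assumed exchange rate

kwh_per_day = Fraction(200 * 8, 1000)   # 200 W for 8 hours
power_yuan_per_day = kwh_per_day * Fraction('0.6')

power_entries_per_day = Fraction('0.16') / usd_per_entry   # 160
internet_entries_per_month = 5 / usd_per_entry             # 5000
monthly_entries = internet_entries_per_month + power_entries_per_day * 30
computer_entries = 80 / usd_per_entry                      # 80000
hourly_usd = 360 * usd_per_entry
hourly_yuan = hourly_usd * yuan_per_usd

print(float(power_yuan_per_day))   # 0.96
print(monthly_entries)             # 9800
print(computer_entries)            # 80000
print(float(hourly_usd), float(hourly_yuan))   # 0.36 2.268
&lt;/pre&gt;&lt;/div&gt;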
&lt;p&gt;In conclusion, I don't think it is worthwhile at all for a worker to do this job. However this kind of service exists, however small it is (the said service provider has 1000 workers at most). I don't understand why. Probably there's poverty that's way beyond my understanding, or probably workers never take it on as their sole source of income. I would put my bet on the latter.&lt;/p&gt;</summary></entry><entry><title>Two Years as Open Source Software Developer, Retrospect and Forward-looking</title><link href="http://blog.liuw.name/two-years-as-open-source-software-developer-retrospect-and-forward-looking.html" rel="alternate"></link><updated>2014-05-05T19:18:00+01:00</updated><author><name>Wei Liu</name></author><id>tag:blog.liuw.name,2014-05-05:two-years-as-open-source-software-developer-retrospect-and-forward-looking.html</id><summary type="html">&lt;p&gt;I've been a developer for the open source
&lt;a href="http://www.xenproject.org"&gt;Xen Project&lt;/a&gt; for two years, and I also
help develop &lt;a href="http://github.com/pythoncn/june"&gt;June&lt;/a&gt; in my spare
time. What I've learned so far is that running a (proper) open source
project is never an easy task.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Open Source&lt;/em&gt; means different things to different people. To many
people who are new to this concept, it probably just means "putting your
source code in the public so that it can be useful to others". This
is of course a valid definition, but it's only the first stage of an
open source project. If you want your project to thrive and to have a
bigger impact, there's much more to do.&lt;/p&gt;
&lt;p&gt;As I understand it, a proper open source project should have:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;publicly available source code&lt;/li&gt;
&lt;li&gt;a proper open source license&lt;/li&gt;
&lt;li&gt;proper documentation to a certain degree&lt;/li&gt;
&lt;li&gt;a public channel for discussion&lt;/li&gt;
&lt;li&gt;a well-defined development process and responsive maintainers&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Though I don't think GitHub does a great job of providing a public
channel for discussion, it does provide a viable solution for #4. It
lowers the barrier to getting involved in an open source project. Mailing
lists are also popular amongst old-school hackers.&lt;/p&gt;
&lt;p&gt;All the rest pretty much depends on the project owner / initiator. #1
should be very easy to achieve. #2 and #3 are often overlooked,
however.&lt;/p&gt;
&lt;p&gt;Not having a license is very likely to turn away potential
contributors, because contributors have the right to know how their
code is going to be used and to avoid potential legal problems. It might
be hard to choose one license from all the popular ones, but it's
definitely something you need to decide at the very beginning.&lt;/p&gt;
&lt;p&gt;It's a bit tricky to define "proper documentation to a certain degree". I
personally believe that the minimum level of documentation should be
enough to help a fresh user / developer start using / developing the
software without any major problem.&lt;/p&gt;
&lt;p&gt;The last item requires the most time and effort. Many people develop open
source software to get a paycheck, but more develop for fun and
non-profit purposes. Wearing the latter hat as a developer for June, I
often feel I don't have enough time to look at all the missing
features and answer all the questions. There's really no silver bullet;
all we can do is devote more time.&lt;/p&gt;
&lt;p&gt;As I mentioned at the beginning, I develop open source software for both
profit and fun. How well (or badly) did we (the two teams) do? The
following paragraphs are by no means a claim that I made any significant
contribution to the whole process. They are just my views and my views
only.&lt;/p&gt;
&lt;p&gt;I think the Xen Project is doing well in general. We've got a team of
people who understand the development of open source projects. We try
hard to work with upstream / downstream projects. All development
activity happens on the public mailing list and the process is
well-defined. We hold regular Xen Docsday to update documents,
etc. During the last year we've seen some significant improvement in
the whole development process, especially after joining the Linux
Foundation.&lt;/p&gt;
&lt;p&gt;As for June, it's not that good. I got involved in June one and a half
years ago. June has been an open source project since its
genesis. Where it falls short is the last but most important item -- the
process is not very well-defined and the code is not very well
maintained. I'm not blaming anyone, because no one benefits
from developing June, but how can we fix / improve things? Adding new
features is important, but it's just impossible to develop all the
features on my own. And in my opinion, only by building an eco-system
(however small) can we make a project healthy and prosperous.&lt;/p&gt;
&lt;p&gt;When it comes to June, the core idea is that now we need to build a
minimum viable product for both end users and developers. I've tried
to identify some key steps to bootstrap and attract developers:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;improve documentation&lt;/li&gt;
&lt;li&gt;develop necessary features, better test coverage&lt;/li&gt;
&lt;li&gt;request for contribution&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;I managed to spend some time rewriting several documents and defining
a clearer development process. As for features, there's still
lots of work to do: many missing features remain on the
TODO list. Unfortunately, up to this point the core team is really on
its own; requesting external contributions might be too early at this
stage.&lt;/p&gt;
&lt;p&gt;I'm currently working on #2 on that list, though I cannot say for sure
how much time I can put into it. At least I've got a plan and will try
to stick to it. Let's see how it plays out.&lt;/p&gt;</summary></entry><entry><title>Fix Debian Wheezy's Openswan Regression with Apple Devices</title><link href="http://blog.liuw.name/fix-debian-wheezys-openswan-regression-with-apple-devices.html" rel="alternate"></link><updated>2014-05-04T23:40:00+01:00</updated><author><name>Wei Liu</name></author><id>tag:blog.liuw.name,2014-05-04:fix-debian-wheezys-openswan-regression-with-apple-devices.html</id><summary type="html">&lt;p&gt;In late March Debian Security team pushed a security update to Openswan package in Wheezy. My Macbook Air cannot connect to Openswan anymore after upgrading.&lt;/p&gt;
&lt;p&gt;I searched a bit and found out that somebody had already filed a bug report on the Debian bug tracker: &lt;a href="https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=744717"&gt;bug 744717&lt;/a&gt;. There's also a report on Openswan's GitHub page confirming this bug: &lt;a href="https://github.com/xelerance/Openswan/issues/78"&gt;issue 78&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;On the Debian bug tracker a user named Liu DongMiao provided a patch and it was reported to work. I had a look at that patch but didn't quite like it, because it leaked a macro into a C file, which is not a proper fix to me.&lt;/p&gt;
&lt;p&gt;The root cause of the problem is a conflict in a header file. The Debian Security team removed ISAKMP_NEXT_NATD_BADDRAFTS. Then they had to comment out or remove some code that handles that flag, which Apple devices rely on. Unfortunately I cannot get Apple to fix their code, so I have to fix mine.&lt;/p&gt;
&lt;p&gt;The fix is simpler than I thought. I compared upstream Openswan 2.6.37 with Wheezy's patched version, then restored the original flag and those code snippets. I was worried that the overridden new flag was used in the code, but it wasn't. I didn't touch the changelog or version number of the package, so that it will be upgraded automatically when the maintainers push a new package that fixes the bug.&lt;/p&gt;
&lt;p&gt;I have attached my &lt;a href="../files/fix-osx-baddraft-flag.patch"&gt;patch&lt;/a&gt;. If you want to know how to rebuild a Debian package, the &lt;a href="https://www.debian.org/doc/manuals/maint-guide/index.en.html"&gt;Debian New Maintainers' Guide&lt;/a&gt; is a good starting point.&lt;/p&gt;</summary></entry><entry><title>2012 Year-End Review</title><link href="http://blog.liuw.name/2012-year-end-review.html" rel="alternate"></link><updated>2012-12-28T20:40:00+00:00</updated><author><name>Wei Liu</name></author><id>tag:blog.liuw.name,2012-12-28:2012-year-end-review.html</id><summary type="html">&lt;p&gt;Since December 31, 2012 falls on a Monday, a working day, I probably won't have time to write a blog post then, so I'm writing it over the weekend instead.&lt;/p&gt;
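&lt;p&gt;My previous blog, based on WordPress, went offline when the VPS expired, since none of us planned to keep investing in it. After that I started thinking about switching to a static blog system such as Pelican, Octopress, Hyde or Jekyll. I made some attempts between July and September, but because of the plugins I had used in WordPress and some formatting problems, the converted old posts never came out satisfactorily. I'm also too lazy to reinvent the wheel, so the plan to restart the blog kept being put off until now.&lt;/p&gt;
&lt;p&gt;It is now the end of the year. If I don't get started, I'm afraid this will move from the Todo list to the Undo list and finally to the Never-do list, so I had better get moving. I picked Pelican, which is written in Python, a language I'm fairly familiar with, and am starting from scratch. If I think of posts from the old blog that are useful or interesting, I will port them to Pelican by hand.&lt;/p&gt;
&lt;p&gt;Everyone's year has its own highs and lows, but in the end it can always be summed up in one saying: like drinking water, only the drinker knows whether it is cold or warm. Write too much and it inevitably turns into a dull chronicle that puts people to sleep, so I'll skip the trivia and record only the important things.&lt;/p&gt;
&lt;p&gt;From January to April I was an intern in Cambridge. Naturally I didn't go home for the Spring Festival: although I had 12 days of holiday, the flights were really too expensive, and I had already spent a week at home before leaving at the end of November the previous year, so in the end I chose to stay and keep working. Overall, I did reasonably well in the internship. My idea at the time was to challenge myself and see whether I could fill such a big hole within four months. Unfortunately the network I/O hole was just too big: after four rounds of patches I could only say I had barely started, so in the end I had to leave it open for the time being. Fixing a few small bugs along the way was no trouble at all.&lt;/p&gt;
&lt;p&gt;In early April, because of a lab project, I took the high-speed train straight back to Wuhan after returning to China, going more than twenty hours without rest. The day after getting back to Wuhan I went right back to work; even I was impressed by my own energy. After the project passed its review smoothly, I started on my graduation work. Although I was luckily (unluckily) picked at random for the university's thesis plagiarism check, I passed without trouble because I had typed the whole thesis myself, bit by bit. However, the system described in the thesis was never fully finished, and the code has gone missing, which is another regret. The other regret from that period is that I had promised a student at Huazhong University of Science and Technology (华科) that I would visit and chat with them after coming back, but it never happened, and all I can do is say sorry to them.&lt;/p&gt;
&lt;p&gt;At the end of July I moved on to Hangzhou and joined Taobao's Core Systems department. I was assigned to Duolong's (多隆) team and took over some development and maintenance work on the T4 system led by Bixuan (毕玄). The day-to-day work was not heavy, but a big company comes with all kinds of restrictions, so overall efficiency was not high. Because T4 also involved kernel changes, I got to know Gaoyang (高阳) and Handai (含黛) from the kernel team, and we became good friends. Together with the "brainwashing" training and the technical training, I met quite a few people. You could say that these managers, technical experts and friends from the various business and product lines are the biggest thing I gained at Taobao. I grew comparatively less on the technical side; the main takeaway was experiencing the technical culture of an Internet company: iterate fast, ship things that actually land, and strike a balance between elegance and speed.&lt;/p&gt;
&lt;p&gt;After much thought, I decided to go out and see the world while I'm still young, so in the end I left Taobao and accepted the offer from the Platform Team. I owe a lot to Duolong's support here: as soon as I brought it up, he readily agreed. I suppose this is what they call a friendship between gentlemen. Although he was my manager, he always treated me like a friend and cared about my personal development. I really must thank him.&lt;/p&gt;
&lt;p&gt;So after half a month of rest at home, I boarded a plane to the UK once again. Interestingly, my arrival date was only four days off from the previous year's. Coming back, I no longer felt the nervousness of a newcomer. Onboarding, finding a place, moving in: I knew the way, and everything went step by step. The Platform Team had even kept my desk for me, untouched. As soon as I was back I started digging a new hole; hopefully I can fill it by January 2013.&lt;/p&gt;
&lt;p&gt;To sum up, 2012 was a very eventful year: I graduated smoothly and travelled north and south, up mountains and down to the sea. The world outside is wonderful: I learned different things from many different people and saw many sights I had never seen before. The world outside is also full of helplessness: I had to face the fact that two people who had been together for many years went their separate ways. In the end it is still the same saying: like drinking water, only the drinker knows whether it is cold or warm. I hope that in 2013 everyone I love and everyone who loves me, all friends known and unknown, will do better and better; that all happiness, known and unknown, will keep growing; and that all suffering, understood or not, will keep shrinking.&lt;/p&gt;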
&lt;p&gt;With best regards&lt;/p&gt;</summary></entry></feed>