Liuw's Thinkpad

To win, first learn to lose; to succeed, first learn to fail

Archive for the ‘xen’ tag

Xen netback improvements

without comments

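The text below is compiled from an email by Ian Campbell, plus some of my own thoughts and material I looked up.

Xen netback still works on a copying model, which means that data exchanged between frontend and backend is not zero copy. IanC is currently upstreaming the skb paged fragment destructor patches. This series lets the backend do grant mapping for guest RX and thereby achieve zero copy. Doing follow-up work on top of the copying model makes little sense.

Right after that comes a series of refactorings.

In the current model each VCPU of the driver domain gets one netback worker, and these workers are shared among multiple VIFs. The plan is to switch to one netback worker per backend. Memory usage needs attention when making this change: in the original model memory usage is basically fixed, while in the new model the number of workers follows the number of backends, so it has to scale. A memory pool could be used to keep memory consumption under control when there are many workers. For a first implementation, though, static memory allocation is acceptable.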
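To make the memory trade-off concrete, here is a minimal Python sketch comparing static per-worker allocation with a shared pool. The class name, the buffer counts and the "grant what is left" policy are all assumptions made for illustration; none of them come from netback or from IanC's email.

```python
class BufferPool:
    """Hypothetical shared pool: caps total buffers regardless of worker count."""

    def __init__(self, total_buffers):
        self.free = total_buffers

    def acquire(self, wanted):
        # Grant whatever is still available instead of failing outright.
        granted = min(wanted, self.free)
        self.free -= granted
        return granted

    def release(self, count):
        self.free += count


def static_footprint(num_backends, buffers_per_worker):
    # Static allocation: every per-backend worker owns its buffers up front,
    # so the footprint grows linearly with the number of backends.
    return num_backends * buffers_per_worker


pool = BufferPool(total_buffers=1024)
granted = [pool.acquire(256) for _ in range(100)]  # 100 backends ask for 256 each
print(static_footprint(100, 256), sum(granted))
```

With static allocation the footprint is the product of the two numbers, while the pool never hands out more than its fixed total; the price is that late workers may get fewer buffers than they asked for.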

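Switch the internal interface to NAPI. This improvement depends on the previous one. TX completion is fairly cheap for most devices, so the biggest cost when using NAPI is RX. After switching over, however (NAPI polls in softirq context rather than in a thread), we have to consider whether a VCPU could be overloaded, which is also why threads are still used at present. This part needs careful testing.

The work after that depends on NAPI.

Receiver (guest) side copy: the description was too vague and I am not sure what IanC meant.

Multiple queues for TX and RX: this may be related to the VMDQ that Stefano mentioned, which offloads the dispatch work to hardware (it used to be done by the software bridge). Intel's 10 Gigabit NICs already support it. The implementation and tests on VMware reportedly raised throughput from 4 to 9.2, a gain of about 1.3x (roughly 2.3 times the baseline).

Besides the ideas above, there is one relatively independent improvement: SR-IOV support in netback.

There are many ideas and they are not systematic yet. I am listing them here for later use.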

Written by liuw

October 17th, 2011 at 10:17 am

Posted in Tech


Follow-ups of Virtio on Xen

without comments

This post should have been finished a long time ago. Sorry folks, I don’t know if I will really have time to finish what I left over, but it is necessary to write it down so that someone (me?) can pick it up and move on…

I wrote some of my thoughts in Xenwiki’s VirtioOnXen page. However, those items are high-level and abstract. Now I will explain my TODOs for the second iteration.

* Enable Xen mapcache for Virtio for PV

The prototype has a very bad exec implementation. For every read/write to memory, it first maps DomU’s memory, then does the r/w, then unmaps it. A single operation causes two page table updates. That’s not ideal.

However, it is not easy to enable mapcache in the PV case. Xen’s PV memory model is quite different from HVM’s. Xen mapcache was originally designed for HVM and is tightly coupled with QEMU: it is hooked into cpu_physical_memory_rw. It is not easy to unify the HVM and PV memory models (is it really possible?) and reuse cpu_physical_memory_rw.

The way I can see to get this done is to first refactor the mapcache to decouple it from the existing code. Then we can use the mapcache for PV exec.
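To show why a mapcache helps, here is a toy Python model that only counts page table updates. GuestMemory, write_uncached and MapCache are hypothetical names invented for this sketch; they are not QEMU or Xen APIs, and the model assumes an access never crosses a page boundary.

```python
class GuestMemory:
    """Toy stand-in for DomU memory that counts page table updates."""
    PAGE_SIZE = 4096

    def __init__(self):
        self.pages = {}
        self.pt_updates = 0

    def map_page(self, pfn):
        self.pt_updates += 1          # mapping touches the page tables
        return self.pages.setdefault(pfn, bytearray(self.PAGE_SIZE))

    def unmap_page(self, pfn):
        self.pt_updates += 1          # so does unmapping


def write_uncached(mem, addr, data):
    # Prototype behaviour: map, write, unmap on every single access.
    pfn, off = divmod(addr, mem.PAGE_SIZE)
    page = mem.map_page(pfn)
    page[off:off + len(data)] = data
    mem.unmap_page(pfn)


class MapCache:
    """Keeps mappings alive so repeated accesses reuse them."""

    def __init__(self, mem):
        self.mem = mem
        self.mapped = {}

    def write(self, addr, data):
        pfn, off = divmod(addr, self.mem.PAGE_SIZE)
        page = self.mapped.get(pfn)
        if page is None:              # only the first access pays for a map
            page = self.mapped[pfn] = self.mem.map_page(pfn)
        page[off:off + len(data)] = data


plain, cached_mem = GuestMemory(), GuestMemory()
cache = MapCache(cached_mem)
for i in range(100):
    write_uncached(plain, i * 8, b"x" * 8)
    cache.write(i * 8, b"x" * 8)
print(plain.pt_updates, cached_mem.pt_updates)
```

A real mapcache also needs eviction and invalidation, which this sketch leaves out entirely.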

* Squash two evtchns into one, eliminate locking

Currently two evtchns are used in the transport layer. One (evtchn1) carries the transport-layer notification “BE, I need to control the device” (FE->BE); the other (evtchn2) carries the backend notification telling drivers “hey, you have data waiting” (BE->FE).

Some may wonder why we need two in the first place. All I can say is that I wanted to emulate the behavior of Virtio in the HVM case, that is, a synchronous “trap-process-return”. PV cannot really “trap”; it can only spin-wait on a shared variable (or a bit). With these two evtchns, locking is necessary, because “process” may trigger evtchn2, to which an IRQ handler is bound. That re-activates DomU from a different code path, and the transport layer never gets a chance to “trap” again.

So, how does squashing the evtchns help produce lock-free code? We abandon evtchn2 and introduce a bit to indicate notification. We can defer data processing to a work queue. This is how xen-pcifront and xen-pciback work.

To achieve this goal, we first need to introduce bitops into QEMU, because we now have two bits: BE acks FE’s control request, and BE tells FE there is something waiting in the ring. These two bits must be handled in strict order. I will not go deep here; see xen-pcifront and xen-pciback for details. After introducing bitops, we can start refactoring the transport layer, which should be easy.
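The single-evtchn scheme can be sketched as a toy, single-threaded Python model. The bit names, the class names and the exact ordering (process, then publish data, then ack) are my assumptions for illustration only; the real ordering rules are the ones in xen-pcifront and xen-pciback.

```python
REQ_ACTIVE = 1 << 0     # FE -> BE: control request pending; BE clears it as the ack
DATA_PENDING = 1 << 1   # BE -> FE: something is waiting in the ring


class SharedPage:
    def __init__(self):
        self.flags = 0


class Backend:
    def __init__(self, shared):
        self.shared = shared
        self.handled = []

    def on_event(self, request):
        # Assumed order: process first, publish the data bit, ack last.
        self.handled.append(request)
        self.shared.flags |= DATA_PENDING
        self.shared.flags &= ~REQ_ACTIVE


class Frontend:
    def __init__(self, shared, backend):
        self.shared = shared
        self.backend = backend
        self.work_queue = []

    def control(self, request):
        self.shared.flags |= REQ_ACTIVE
        self.backend.on_event(request)        # stands in for kicking the one evtchn
        while self.shared.flags & REQ_ACTIVE:
            pass                              # spin until BE acks (the PV "trap")
        if self.shared.flags & DATA_PENDING:
            self.shared.flags &= ~DATA_PENDING
            self.work_queue.append(request)   # defer data processing, no IRQ path


shared = SharedPage()
be = Backend(shared)
fe = Frontend(shared, be)
fe.control("reset")
print(be.handled, fe.work_queue, shared.flags)
```

Because the backend call is synchronous here, the spin loop exits immediately; in a real guest the two sides run concurrently and the bit operations have to be atomic.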

* Enable Virtio device DMA capability

I don’t know if this is upstreamable, but I think it is worth trying. Virtio is designed for hardware-assisted virtualization. It assumes that the backend, from the device’s point of view (say, qemu-kvm), can access the same address space as the CPU does. In the real world, however, devices don’t always have the same view of memory as the CPU.

In order to create a memory space that is consistent across CPU and backend, use of the DMA API is necessary. Unfortunately, Virtio devices don’t respond correctly to the DMA API (because they don’t have to, given the assumption mentioned earlier). So in the current code, the ring is allocated by the DMA API with a NULL device, which is bad practice.

This task may not be trivial. The coding itself doesn’t require much effort. What concerns me most is keeping Virtio intact. If we are to turn a simple kmalloc into DMA API calls, we have to change a lot of its design and internals too.

I swear that I’ve tried my best to explain my thinking. To fully understand my decisions, you may have to read the fxxking code. :-)

Written by liuw

September 16th, 2011 at 8:08 pm

Posted in Tech


The surprise came too late, the happiness left too soon

with 5 comments

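OK, the title is clickbait.

Checking my mail in the morning, I found a message from Stefano. The gist: my GSoC work is going quite well, so would I like to give a talk at Linux Plumbers Conference 2011, with Citrix covering the travel costs? The conference is on September 7 in Santa Rosa, California, and the trip is arranged by Citrix headquarters.

I was really happy at the time, thinking that a bumpkin like me would finally get to see the world. The thought of meeting so many experts was thrilling.

But when I looked into the paperwork (first a passport, then a visa), one month is not really enough. I asked a senior labmate how to go about it, and it is a lot of trouble.

To get a passport I first need a certificate from my department, then one from the graduate school, then I have to take my household registration out of the registration office, and then apply at the exit-entry administration office in Hankou. With luck it takes 10 days; without luck the application is sent back to my place of origin for review and takes 20 days. I have lived in Wuhan for 6 years, so it will probably go back to my place of origin. The most infuriating part is that not a soul was at the graduate school when I went today, and they will certainly be off tomorrow and the day after as well. So there is no telling when I can get my registration out.

Suppose I am lucky enough to get the passport in 10 days; that is already August 10. Then I would book a visa interview with the US embassy (which costs about 1k up front). At present the wait for an appointment is usually a month, so no chance.

Suppose I am luckier still and get an emergency visa appointment: 10 days at the very most, so August 20. Then I go to Guangzhou or Shanghai for the interview, at least 1.5k in travel.

Fine, I get the interview. In my current situation (no job, no proof of income, and going to a technical conference run by a company), the visa officer will hardly believe me if I say I am not going there to look for work, and will probably refuse me on the spot. Even without a refusal there may be a review period of 3 to 4 weeks, by which time the conference will be over.

So with this many steps, I would need an extraordinary streak of luck to make it. For someone whose luck has always been poor, it is mission impossible.

The surprise came too late, the happiness left too soon. In the end I decided to give up. Everyone feels sorry for me, but I think there will be other chances, so I am not too upset. If anything, being offered this opportunity is the community's recognition of the work so far, rough as that work still is.

Still, I decided to get the passport done anyway; it may come in handy later. Opportunity favours the prepared.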

Written by liuw

July 29th, 2011 at 11:47 am

Posted in Life


Small tips for debugging Xen virtual machines

without comments

1. Analysing the context with xenctx

Add settings such as

on_reboot="preserve"
on_crash="preserve"

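to the VM's configuration file. After a problem occurs, xenctx can be used to fetch the VM's context.

2. Debugging QEMU

In the VM configuration file, use device_model_override to point at a script that acts as the DM. The script contains statements like

echo "$@" > /tmp/qemu-dm   # save the arguments xl passes to the DM
sleep 1h                   # keep the fake DM alive in the meantime

Then start the QEMU backend by hand under gdb, using the saved arguments. Note that xl has a timeout, so you have to be quick, or change xl's timeout yourself.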

Written by liuw

July 27th, 2011 at 2:04 pm

Posted in Programming


Status update of Virtio on Xen project

without comments

Hi everybody, it’s the midterm of Google Summer of Code now, so let me tell you what I’ve done and learned during this period.

I started working on the project in the community bonding period. I took Virtio on Xen HVM as my warm-up phase, which helped me understand QEMU and the Virtio implementation better. Luckily, it did not require much work to get Virtio working on Xen HVM. At the end of the community bonding period, I wrote a patch to enable MSI injection for HVM guests, which has been applied to the tree.

Then I started to work on Virtio for pure PV. That’s not trivial. I spent lots of time implementing a Virtio transport layer with Xenbus, event channels and grant tables, which is called virtio_xenbus (corresponding to the current Virtio transport layer virtio_pci, which uses a virtual PCI bus). The new transport layer must retain the same behavior as the old one. However, one fundamental difference between evtchn and vpci is that vpci works synchronously while evtchn is asynchronous by nature. I got inspired by xen-pcifront and xen-pciback and finally solved this problem. Ah, a working transport layer at last.

But porting Virtio to pure PV needs more than a working transport layer. Vring, which is responsible for storing data, also needs some care. The original implementation uses kmalloc() to allocate the ring. It is OK to use kmalloc() to get physically contiguous memory in HVM. However, a Xen PV backend needs to access machine-contiguous memory. So we have to enable Xen’s software IOTLB and replace kmalloc() with the DMA API. Also, the physical addresses in the scatter-gather list should be replaced with machine addresses. With that we get a Vring implementation for pure PV guests.
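The difference between physically contiguous and machine-contiguous memory can be illustrated with a toy physical-to-machine (p2m) table in Python. The table contents and the helper names below are made up for the example; they are not the kernel’s real p2m interface.

```python
PAGE_SHIFT = 12
PAGE_SIZE = 1 << PAGE_SHIFT


def phys_to_machine(paddr, p2m):
    # Translate a guest pseudo-physical address into a machine address.
    pfn, offset = paddr >> PAGE_SHIFT, paddr & (PAGE_SIZE - 1)
    return (p2m[pfn] << PAGE_SHIFT) | offset


def machine_contiguous(first_pfn, npages, p2m):
    # Consecutive PFNs are only useful to the backend if their MFNs are
    # consecutive as well.
    mfns = [p2m[first_pfn + i] for i in range(npages)]
    return all(b == a + 1 for a, b in zip(mfns, mfns[1:]))


# Hypothetical PV guest: PFNs 0..2 are contiguous, their MFNs are not.
p2m = {0: 0x100, 1: 0x2A0, 2: 0x2A1}

print(hex(phys_to_machine(0x1010, p2m)))
print(machine_contiguous(0, 2, p2m), machine_contiguous(1, 2, p2m))
```

A buffer spanning PFNs 0 and 1 looks contiguous to the guest but is scattered in machine memory, which is the situation the software IOTLB and the DMA API are there to handle.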

Is that all? No. One feature we need to disable is indirect buffer support, because this feature causes specific drivers to allocate buffers with kmalloc() at a much higher level. I wrestled with this problem for some time and found that I would rather leave those drivers alone than break them. So I chose to disable this feature for the moment. But this feature is critical to good performance, so I may try to enable it someday.

Good, we finally have our foundation ready! Let’s start to tackle specific drivers. I chose the Virtio net driver as a start. Every driver has its own features. As mentioned above, we should avoid allocating buffers with kmalloc() at the driver level, so the CTRL_VQ feature needs to be disabled. In fact, I have no driver features enabled at the moment. What makes me really happy is that Virtio net almost works out of the box. For now I just want to make sure things work; premature optimization is evil.

What to do next? Virtio blk is my next goal. Hopefully it will not take too long, because I’m a bit behind schedule. Then I will start to port SPICE to Xen, and then try to enable more features of Virtio net/blk and gain better performance. That’s the plan. Time is very limited, but I feel excited.

I’ve learned a lot during this period. I have worked together with the community, and the interaction has worked out quite well. I discussed a lot with Xen developers and got a better understanding of Xen and QEMU, as well as of Virtio itself.

Last but not least, I want to thank Stefano Stabellini, Ian Campbell, Konrad Wilk and everyone else who helped me with my project in this hot summer. I would not have come so far without your help.

Written by liuw

July 16th, 2011 at 10:34 pm

Posted in Programming,Tech


Starting VirtIO for pure PV

without comments

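Last week I wrote a few patches for Xen.

1. A simple typo fix, nothing much to say.
2. An HVMOP for injecting MSI-X for emulated devices, which went into staging. The QEMU part has also been sent to qemu-devel, but the QEMU developers said that injection of emulated MSI-X needs more discussion, so I will wait.
3. VirtIO disk support for libxl, which still needs more polish.

VirtIO for HVM is actually more or less done now. Next comes the hardest part, VirtIO for pure PV.

Stefano said "we don’t exactly know how long it is going to take", and there is quite a lot to consider:

1. What to write into Xenstore.
2. How VirtIO should be initialised.
3. Replacing the low-level implementation (evtchn and so on).

All of this work also has to coexist peacefully with the existing functionality, so I need to understand the existing design better before I can propose a reasonable one.

For the next couple of days I will read the existing code: first the VirtIO code in the Linux kernel, second the Xen PV initialisation code, and third the Xenstore parameter layout of PV net and the like. Then I will discuss things in more depth with Stefano and Konrad.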

Written by liuw

May 30th, 2011 at 8:32 am

Posted in Programming


VirtIO for HVM progress

with 4 comments

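Although I said I would start as soon as possible, progress has actually been slow, because much of the time went into chasing other bugs. Xen unstable and SeaBIOS disagreed about IRQ configuration when used together, so IRQ injection ended up broken.

Being unfamiliar with the IO-APIC and LAPIC, I was stuck here for a long time. In the end it was Stefano who fixed the bug. I am ashamed.

With this bug fixed, the main problem of VirtIO for HVM turned out to be solved as well. As long as "pci=nomsi" is added to the guest command line, the VirtIO NIC is already usable.

Stefano told me the next step is to get VirtIO disk for HVM working. That has to wait until Ian Jackson has refactored libxenlight, and it is mainly a matter of configuration parsing and drivers.

The step after that is MSI for emulated devices. Xen currently only supports injecting MSI into passthrough devices and has no interface for emulated devices, which is why nomsi is needed as mentioned above.

There seems to be no good trick for debugging; it is all adding printk and the like. But why do other people debug so fast while I am so slow? That really is a question. I still do not know the low level well enough, and I need to keep working at it.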

Written by liuw

May 19th, 2011 at 8:14 pm

Posted in Programming


How to port VirtIO to Xen HVM

with 4 comments

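This is only an analysis article with no concrete code. The analysis is not complete and may contain mistakes. Treat it as my own notes.

I read through Anthony's QEMU-dm. When I wrote the proposal I mainly looked at KVM's handling code (kvm-all.c); now that the work is about to start, I need to understand Xen's handling code (xen-all.c) first. The basic principles of the two are the same, and they differ only in some naming and code logic. At first glance this phase of the work is relatively easy, and I already have a fairly clear idea.

As I mentioned in Porting VirtIO to Xen, KVM's dispatch function is kvm_cpu_exec(), which dispatches requests according to the reason for the VMEXIT. Xen must have its own dispatcher as well. At the end of xen_init(), xen_vm_change_state_handler() is registered as QEMU's change state handler. Following xen_vm_change_state_handler(), cpu_handle_ioreq() is in turn registered as the handler of the event channel.

cpu_handle_ioreq() calls handle_buffered_iopage() and handle_ioreq() to process IO requests. handle_ioreq() distinguishes several IO types. Xen names them differently from KVM, but a simple comparison shows the correspondence.

1. KVM_EXIT_IO corresponds to IOREQ_TYPE_PIO
2. KVM_EXIT_MMIO corresponds to IOREQ_TYPE_COPY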
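The correspondence above can be written down as a small Python dispatch sketch. Only the constant names come from the text; the enum, the request dictionaries and the handler signatures are assumptions made for illustration and do not mirror the real handle_ioreq().

```python
from enum import Enum, auto


class IoreqType(Enum):
    # Symbolic stand-ins; the numeric values are not the ones Xen uses.
    PIO = auto()     # IOREQ_TYPE_PIO
    COPY = auto()    # IOREQ_TYPE_COPY


# KVM exit reason -> Xen ioreq type, as listed in the post.
KVM_TO_XEN = {
    "KVM_EXIT_IO": IoreqType.PIO,
    "KVM_EXIT_MMIO": IoreqType.COPY,
}


def handle_ioreq(req, pio_handler, mmio_handler):
    # Toy dispatcher: pick the handler by request type.
    if req["type"] is IoreqType.PIO:
        return pio_handler(req)
    if req["type"] is IoreqType.COPY:
        return mmio_handler(req)
    raise ValueError("unhandled ioreq type: %r" % (req["type"],))


seen = []
handle_ioreq({"type": KVM_TO_XEN["KVM_EXIT_MMIO"], "addr": 0x1000},
             pio_handler=lambda r: seen.append(("pio", r["addr"])),
             mmio_handler=lambda r: seen.append(("mmio", r["addr"])))
print(seen)
```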

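When VirtIO registers a device with KVM, it does so by writing directly to the virtual PCI device, which triggers KVM_EXIT_MMIO. cpu_physical_memory_rw() is then called, and inside this function IO memory is handled by functions that the user has registered.

Registering VirtIO's handlers in the IO memory region is a fairly involved process. The main files involved are pci.c and exec.c. In pci.c there is pci_update_mappings(), which calls cpu_register_physical_memory(); cpu_register_physical_memory() in turn calls cpu_register_physical_memory_offset() to update the IO memory mapping. At initialisation time the handlers are all registered through cpu_register_io_memory into the corresponding function arrays, io_mem_write[] and io_mem_read[]. Another related array is io_mem_opaque[].

io_mem_write[] has type CPUWriteMemoryFunc and io_mem_read[] has type CPUReadMemoryFunc; these two types match those of virtio_ioport_write() and virtio_ioport_read(). (The corresponding typedefs can be looked up.)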
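The table-based registration described above can be modelled in a few lines of Python. This is a toy model under my own assumptions: the array names follow the text, but register_io_memory(), register_physical_memory() and physical_memory_rw() are simplified, hypothetical stand-ins with invented signatures, and the per-access-size handler variants are ignored.

```python
io_mem_read, io_mem_write, io_mem_opaque = [], [], []
regions = []    # (start, size, table index) for each registered IO region


def register_io_memory(read_fn, write_fn, opaque):
    # Store the handlers and their private state; return the table index.
    io_mem_read.append(read_fn)
    io_mem_write.append(write_fn)
    io_mem_opaque.append(opaque)
    return len(io_mem_read) - 1


def register_physical_memory(start, size, index):
    regions.append((start, size, index))


def physical_memory_rw(addr, value=None):
    # Look up the region and call the registered handler with its opaque.
    for start, size, index in regions:
        if start <= addr < start + size:
            opaque = io_mem_opaque[index]
            if value is None:
                return io_mem_read[index](opaque, addr - start)
            io_mem_write[index](opaque, addr - start, value)
            return None
    raise LookupError("no IO region registered at %#x" % addr)


def toy_read(opaque, offset):
    return opaque["regs"].get(offset, 0)


def toy_write(opaque, offset, value):
    opaque["regs"][offset] = value


device = {"regs": {}}
idx = register_io_memory(toy_read, toy_write, device)
register_physical_memory(0xC000, 0x20, idx)
physical_memory_rw(0xC010, 7)       # write lands in the device's handler
print(physical_memory_rw(0xC010))   # read comes back through the same table
```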

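Although I have not understood QEMU's device registration mechanism in full detail, my initial guess is that this processing must end up in virtio_ioport_write(); I will verify this when I actually write the code.

So the conclusion is that, when registering with Xen, control should enter IOREQ_TYPE_COPY. Likewise, when VirtIO needs to trigger an event (kick) under KVM, it also writes directly to the virtual PCI device, and the logic ends up in the same place. I now have an idea of how registration and event triggering work. In practice, not much of this control logic has to change, because in the end it is all dispatched to VirtIO's handlers. What we really need to focus on are the functions that implement VirtIO's low level (vp_notify(), for example). At present these handlers use KVM functionality directly. According to Stefano, it is best to replace them with QEMU-generic functions, which amounts to adding a layer of glue, and then implement the low-level handling separately for KVM and Xen.

Does virtual PCI need to change? No. Even after this much analysis I still do not fully understand how QEMU registers devices, but as things stand this level of understanding is basically enough. What I need to focus on is which KVM interfaces VirtIO uses; those are the places that really need modifying. The Xen tools need few changes.

My plan was to start writing code as soon as possible, but the development environment just would not come together. First the VGA console did not work, which took a few days. Upstream QEMU and Xen QEMU are not exactly the same, which took another two days of tuning; it is usable now but still has problems. I have solved quite a few problems these days, but none of them were the core problem, so I am somewhat irritated. I will not be too optimistic about time estimates. Low-level development seems to have more than its share of unpredictable situations, which is very frustrating. All I can do is start as soon as possible and buy as much time as I can.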

Written by liuw

May 6th, 2011 at 4:26 pm

Posted in Tech


Porting VirtIO to Xen

without comments

by Wei Liu <liuw #SPAMFREE# liuw #DOT# name>

1 Overview

VirtIO is a unified paravirtualized IO framework created by Rusty Russell. It’s not hypervisor-specific, but it is mainly used in KVM. It is possible to port VirtIO to Xen without much effort.

This article is organized into several sections. Section 2 explores how VirtIO is used in KVM, with much attention paid to code analysis. Section 3 discusses how we can port VirtIO to Xen, both for normal PV and PV-on-HVM. Section 4 describes what performance tests will be done. Section 5 introduces the porting plan for Spice (spice-space.org).

2 How VirtIO is used in KVM

This topic can be divided into three subtopics.

2.1 The role of KVM

KVM acts as the hypervisor. It is responsible for capturing events and then passing them to QEMU. I’m not going to illustrate how it works because that is out of our scope.

2.2 How VirtIO is used in Linux kernel

There are two key aspects in creating a cross-domain communication channel:

  1. how to deliver events, e.g. Xen’s event channel, or KVM’s handler that traps VM_EXIT plus its event dispatcher.
  2. how to share data, e.g. Xen’s ring buffer, VirtIO’s virtqueue.

Xen provides both of these out of the box. VirtIO, however, only provides a mechanism for data sharing; notification is left to the user. VirtIO also registers as a bus inside the kernel, which resembles XenBus.

The core structure in VirtIO is the virtqueue. It contains a vring (much like a ring buffer) and some other information. One thing worth mentioning is that a virtqueue also wraps up two important function pointers: one for notification, which is exactly what we need to replace, and one for the callback, which is largely irrelevant to porting.

Let’s take the virtio network device as an example.

First things first: virtio network in the Linux kernel is implemented as a PCI device, so it is necessary to implement a virtio PCI bus. See drivers/virtio/virtio_pci.c for details. When porting to Xen, it might be necessary to replace this PCI bus with XenBus. (However, it might be left unchanged in PV-on-HVM, since we need to do a VM_EXIT and trap into the hypervisor anyway.)

The source of virtio network lies in drivers/net/virtio_net.c. virtnet_probe() is responsible for probing. In this function, virtnet is set up with at least two vrings, “input” and “output”, and an optional vring for “control”. The callback for “input” is skb_recv_done() and the callback for “output” is skb_xmit_done().

After the call to vdev->config->find_vqs(), these 2 or 3 vrings are set up. If we trace down this function (it lives in virtio_pci.c as vp_find_vqs()), we find that it in turn calls vp_try_to_find_vqs(), setup_vq() and request_irq().

In setup_vq(), the framework actually allocates the vring for data sharing. It is worth noting that the notify function is vp_notify(), which directly writes queue_index to (vp_dev->ioaddr+VIRTIO_PCI_QUEUE_NOTIFY) to generate a VM_EXIT, so that the hypervisor can catch the event and dispatch it.

The requested irqs are used to invoke the callback functions, i.e. skb_xmit_done() and skb_recv_done().
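The guest-side flow of this section can be summarized with a toy Python model. Virtqueue and ToyPciTransport are simplified, hypothetical stand-ins for the kernel structures; only the register offset 16 for VIRTIO_PCI_QUEUE_NOTIFY is taken from the legacy virtio PCI layout, and the interrupt path is reduced to calling every callback.

```python
VIRTIO_PCI_QUEUE_NOTIFY = 16    # offset of the notify register (legacy layout)


class Virtqueue:
    def __init__(self, index, name, notify, callback):
        self.index = index
        self.name = name
        self.notify = notify        # how the guest signals the host
        self.callback = callback    # what runs when the host signals back

    def kick(self):
        self.notify(self)


class ToyPciTransport:
    def __init__(self):
        self.port_writes = []       # (offset, value) pairs a hypervisor would trap
        self.vqs = []

    def notify(self, vq):
        # Like vp_notify(): write the queue index to the notify register.
        self.port_writes.append((VIRTIO_PCI_QUEUE_NOTIFY, vq.index))

    def find_vqs(self, names, callbacks):
        self.vqs = [Virtqueue(i, name, self.notify, cb)
                    for i, (name, cb) in enumerate(zip(names, callbacks))]
        return self.vqs

    def interrupt(self):
        # Simplified IRQ handler: invoke every queue's callback.
        for vq in self.vqs:
            if vq.callback:
                vq.callback(vq)


events = []
transport = ToyPciTransport()
rx, tx = transport.find_vqs(
    ["input", "output"],
    [lambda vq: events.append("recv:" + vq.name),
     lambda vq: events.append("xmit:" + vq.name)])
tx.kick()
transport.interrupt()
print(transport.port_writes, events)
```

Porting to another transport then comes down to passing a different notify function when the queues are created, while the callbacks stay as they are.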

2.3 How VirtIO is used in QEMU

QEMU runs on top of KVM and interacts with it via /dev/kvm. KVM has to cooperate with QEMU. In fact, a virtual machine in KVM is merely a process.

VM instructions execute natively on the CPU. However, when a VM executes a sensitive instruction, it is trapped by KVM, which then passes the instruction to QEMU to emulate / handle it.

KVM hands the instruction over to QEMU in kvm_cpu_exec(), which is in kvm-all.c. There is a `switch` on the exit_reason. Exit reasons include (port) IO, interrupt and MMIO, etc.

QEMU has full access to the guest’s memory, but it first has to get hold of the virtqueue inside the VM in order to communicate with virtio_net. When the VM calls setup_vq(), it voluntarily writes the virtqueue’s address to (vp_dev->ioaddr+VIRTIO_PCI_QUEUE_PFN), which is trapped by KVM. KVM passes it to QEMU. QEMU then calls virtio_ioport_write() -> virtio_queue_set_addr() to set up VRing, the control structure VirtIO uses inside QEMU. This is how QEMU and the VM create their data-sharing channel.

The notification channel is much simpler. As mentioned above, the VM writes the queue index to VIRTIO_PCI_QUEUE_NOTIFY. The write ends up in virtio_ioport_write() and is then passed to virtio_queue_notify(). In virtio net’s case, the request is finally handled by virtio_net_handle_rx() or virtio_net_handle_tx_{timer,bh}().
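The host-side half can be sketched in Python as a dispatcher keyed on the register offset. ToyVirtioDevice is a hypothetical class, not QEMU’s implementation; the register offsets and the 12-bit address shift follow the legacy virtio PCI layout, and the queue-select register is included even though the text above does not mention it, because a PFN write applies to the currently selected queue.

```python
VIRTIO_PCI_QUEUE_PFN = 8
VIRTIO_PCI_QUEUE_SEL = 14
VIRTIO_PCI_QUEUE_NOTIFY = 16
VIRTIO_PCI_QUEUE_ADDR_SHIFT = 12   # the guest writes a page frame number


class ToyVirtioDevice:
    def __init__(self, handlers):
        self.handlers = handlers                 # one output handler per queue
        self.queue_sel = 0
        self.queue_addr = [0] * len(handlers)

    def ioport_write(self, offset, value):
        if offset == VIRTIO_PCI_QUEUE_SEL:
            self.queue_sel = value
        elif offset == VIRTIO_PCI_QUEUE_PFN:
            # Data channel: remember where the guest placed this queue's vring.
            self.queue_addr[self.queue_sel] = value << VIRTIO_PCI_QUEUE_ADDR_SHIFT
        elif offset == VIRTIO_PCI_QUEUE_NOTIFY:
            # Notification channel: run the handler of the kicked queue.
            self.handlers[value](value)


kicked = []
dev = ToyVirtioDevice([kicked.append, kicked.append])
dev.ioport_write(VIRTIO_PCI_QUEUE_SEL, 1)
dev.ioport_write(VIRTIO_PCI_QUEUE_PFN, 0x1234)
dev.ioport_write(VIRTIO_PCI_QUEUE_NOTIFY, 1)
print(hex(dev.queue_addr[1]), kicked)
```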

Many VirtIO-related files reside in QEMU’s hw/ subdirectory, and many others in the Linux kernel tree.

3 How to port VirtIO to Xen

Now that we have a first impression of VirtIO, it is time to discuss how we can port it to Xen.

3.1 PV-on-HVM

In the PV-on-HVM case, things are more or less the same as they are in KVM. Xen traps the VM_EXIT and passes the request to QEMU. QEMU emulates. Then Xen delivers the result and the VM resumes running.

3.1.1 How to grab virtqueue address

Xen uses QEMU as KVM does, so the Linux kernel can stay untouched. Xen captures the guest’s write to PCI configuration space, and QEMU then handles the changes in VirtIO configuration.

3.1.2 How to deliver event

VirtIO’s current virtual PCI implementation ($QEMU/hw/virtio-pci.c) uses KVM’s event notification functions such as kvm_set_ioeventfd_pio_word() and kvm_has_many_ioeventfds(). It is necessary to replace them with corresponding implementations for Xen. Anthony’s QEMU-dm ships with xen-all.c, which might give me some hints on the implementation.

3.2 Normal PV

When it comes to the normal PV case, we have to do things differently. I will try my best to detail what should be done and how. However, this is just a rough design, and things may change during implementation.

3.2.1 How to grab virtqueue address

By Xen convention, it is common for Dom0 / DomU to use Xenstore to expose their public information, as netfront/netback, blkfront/blkback, etc. do. So it is a good idea to use Xenstore to expose VirtIO information. Xenstore’s well-defined API will greatly reduce the work needed.

At the implementation level, it is necessary to replace the virtual PCI bus with XenBus. VirtIO uses virtual PCI to configure the network device. However, in the normal PV case it is not necessary to expose a virtual PCI device to the VM. We can follow the pattern netfront and netback use to establish their channel.

QEMU-dm from Anthony has functions to manipulate Xenstore, which should help a lot. It also has xen_nic.c, which can serve as inspiration for implementing a VirtIO network device for Xen.

3.2.2 How to deliver event

No doubt the event channel is the best choice. Anthony’s QEMU-dm contains a file named xen_backend.c, which is used for event handling. The Linux kernel has event channel handling functions, too (drivers/xen/{events,evtchn}.c).

So, just replace any notification-related function with Xen’s implementation. That’s the plan.

3.3 Other stuff

In the PV-on-HVM case, QEMU needs to emulate. In the normal PV case, no emulation is needed. In either case, QEMU works as the backend dispatcher for VirtIO. Once the channel between the two VMs is established, QEMU is supposed to work out of the box. However, I can’t be too optimistic here; it might require some work, such as rewriting and debugging some functions.

3.4 Knowledge required

To be honest, QEMU’s concepts (like proxy / virtual device management) are somewhat strange to me. There are some high-level design documents, but they are just too high-level. A thorough understanding of QEMU’s internals is required.

Knowledge of hardware virtualization is also required. I need to understand how Xen implements the HVM interface and choose the right function for each piece of functionality.

Knowledge of XenBus configuration is a must for the normal PV port. I’ve read about it before, so this is the easier part.

4 Performance tests

Performance tests will be run with industry-standard software like kernbench, ioperf and netperf. Test suites will be run on several different configurations:

  • Native Linux, CPU, disk and network.
  • Xen with normal PV VirtIO support, CPU, disk and network.
  • Xen with PV-on-HVM VirtIO support, CPU, disk and network.
  • Xen with original PV support, CPU, disk and network.
  • KVM with VirtIO support, CPU, disk and network.

A short report will be written based on the results, comparing the resulting data and analyzing the advantages / disadvantages of each configuration.

5 Porting of Spice

Spice will be ported to Xen’s HVM environment as a real-world test suite. According to its design, Spice communicates with QEMU via the Virtual Device Interface (VDI). The Spice client and server run entirely in userland (correct me if I’m wrong, I’m not a Spice expert). If we are able to run QEMU with QXL or any other VirtIO device on Xen, it should not be so hard to get Spice running on Xen.

AFAIK, QXL in QEMU (hw/qxl.c) uses its own paravirtualized ring implementation. It also uses qemu_set_irq() to deliver events. So the main idea is to replace this implementation with Xen’s, which will already have been done when porting VirtIO.

The plan is to run Spice with our modified QEMU and eliminate any bugs encountered.

6 Reference

  • Linux kernel 2.6.38.2
  • QEMU-dm, git://xenbits.xen.org/people/aperard/qemu-dm.git
  • Xen-unstable, git://xenbits.xen.org/xen-unstable.git
  • Spice project, spice-space.org



Written by liuw

April 26th, 2011 at 10:38 pm

Posted in Programming


Changes in Xen's writable page table mechanism

without comments

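When I read papers in the past, Xen's writable page table mechanism was always described as working like this:

1. DomU writes to an L1 page table; because Xen has marked the L1 page table read-only, this raises a page fault;
2. Xen catches the page fault, unhooks the L1 page table from the process page table, and makes it writable;
3. DomU updates the page table by itself;
4. Xen validates the updates and, if they are legal, hooks the L1 page table back into the process page table;
5. DomU continues normally.

But now that we are writing a paper ourselves, we found that it no longer works that way. The version we use is Xen-3.3.0, and it works as follows:

1. DomU writes to an L1 page table; because Xen has marked the L1 page table read-only, this raises a page fault;
2. Xen catches the page fault and emulates the write on behalf of DomU;
3. DomU continues normally.

The former approach reduces the number of traps, so speed and efficiency are relatively assured, and the kernel code does not need many changes.

The latter approach is a bit slower, but acceptable for a small number of page table updates; bulk updates should issue hypercalls explicitly.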
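The trade-off between the two schemes can be made concrete by counting traps in a toy Python model. The counting rules are my own simplification: the unhook model is charged only one fault per L1 page written and its revalidation cost is ignored, and the batch size used for explicit hypercalls is an arbitrary number chosen for illustration.

```python
def traps_unhook_model(updates):
    # Old scheme: the first write to each L1 page faults, then the guest
    # writes freely until Xen hooks the page back in (not counted here).
    return len({l1_page for l1_page, _ in updates})


def traps_emulate_model(updates):
    # Xen-3.3.0 scheme as described above: every write faults and is emulated.
    return len(updates)


def traps_hypercall_batch(updates, batch_size):
    # Explicit hypercalls: one trap per batch of updates (ceiling division).
    return -(-len(updates) // batch_size)


# 512 updates that all land in the same L1 page table (page 0).
updates = [(0, entry) for entry in range(512)]
print(traps_unhook_model(updates),
      traps_emulate_model(updates),
      traps_hypercall_batch(updates, batch_size=128))
```

For a handful of updates the emulation cost is negligible, while for bulk updates the explicit batched path is the one that scales.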

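I do not know in which version this changed. Luckily I paid attention while writing, otherwise it would have been embarrassing. It pays to be meticulous.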

Written by liuw

March 13th, 2011 at 12:28 pm

Posted in Tech
