Play “Overcooked” efficiently

I got a Nintendo Switch from my friend (for a research project). Meanwhile, I have been enjoying the game “Overcooked” on it. In this game, you control cooks who perform a variety of tasks and deliver orders in time. Delivering an order early earns a tip. One to four players can play simultaneously.

It’s clear that you have to do everything as quickly as possible to achieve a high score in the game. Every task (e.g. cutting meat) needs some time to complete. Certain tasks (e.g. frying) depend on other tasks. To eliminate unnecessary time costs (e.g. waiting for cutting to complete), I use the following strategies:

  • Minimize workers’ stall time (doing nothing). For example, workers do not need to wait for the frying process (polling is not efficient). Like interrupts in modern machines, they can do something else, such as washing dishes or cutting meat, while waiting for the cooking. Once the interrupt fires (frying completes), they enter the interrupt service routine: put the food on a plate. In most cases, the food is ready to serve at that point. Finally, they return to what they were doing.
  • Again, make sure everyone is doing something. This is especially important if you are playing with your friends. You had better analyze the dependency chain and discuss strategies with your friends before starting the game. Of course, you should also issue instructions to them during the game if necessary.
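To put rough numbers on the polling vs. interrupt idea, here is a minimal sketch of a toy model with one cook. It is not taken from the game: the function name `kitchen_time` and all durations are made up for illustration, time is counted in whole units, and chopping is assumed to be pausable and resumable at no cost.

```python
def kitchen_time(fry_time, plate_time, chop_work, use_interrupts):
    """Time units one cook needs to plate the fried food and finish all chopping.

    Toy model (all parameters hypothetical): the pan fries unattended for
    fry_time units, plating takes plate_time units, and chop_work is the
    total amount of chopping, one unit per tick.
    """
    if not use_interrupts:
        # Polling: stare at the pan until it is done, plate, then chop.
        return fry_time + plate_time + chop_work

    clock = 0
    chopped = 0
    plated = False
    while chopped < chop_work or not plated:
        if not plated and clock >= fry_time:
            # "Interrupt service routine": drop everything and plate the food.
            clock += plate_time
            plated = True
            continue
        if chopped < chop_work:
            chopped += 1  # useful work while the pan is busy
        clock += 1        # one tick passes (chopping or idling)
    return clock


print(kitchen_time(10, 2, 8, use_interrupts=False))
print(kitchen_time(10, 2, 8, use_interrupts=True))
```

With these made-up numbers the polling cook needs 20 time units and the interrupt-driven cook needs 12, because the chopping hides entirely behind the frying.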

Not all kitchens are easy to deal with. Some have dynamic arrangements: contents may change their location during the game session. Some kitchens have no constant light source. Others have isolated workspaces with conveyor belts or tables for swapping materials (I call these a “bus”).

  • Tables for swapping are usually space-constrained. If you are playing with your friends, you are probably simulating a Symmetric Multiprocessing system, and the bottleneck will be bus bandwidth. In such cases, you should consider the priority of materials. Once a transfer finishes, pick the items up as soon as possible.
  • Conveyor belts are high-latency buses, but they have relatively high bandwidth (Hey DDR4, I am looking at you). In some kitchen scenarios (e.g. making burgers), you can put everything on the belt in batches and fetch in batches too.
  • Some conveyor belts lead to a trash can, which means materials must be fetched before they reach it. But some cooking utensils will appear again if you put them into the trash can. Knowing this, you can prioritize the transfer of contents on the conveyor belt.
  • Try to achieve full-duplex transfer and prefetch to save time. Consider the following scenario: you have a pot that cooks rice on one side, and food materials (rice and flour tortillas) on the other side. The first time, you get rice and put it into the pot. Once the rice finishes, you carry the cooked rice to the other side and wrap it with a tortilla. Don’t fetch the tortilla separately in another transfer. If you really have to do that, you can instruct other cooks (if any) to prefetch some for you.
  • Prefetch might not work for all kitchens. In the case of cooking soup, mice will steal your food if it is unattended for a while. But you can secure processed food in pots so it won’t get stolen.
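The latency vs. bandwidth trade-off between tables and belts can be sketched with a little arithmetic. This is only a toy model under my own assumptions: the function names, the single-slot table, and every timing below are hypothetical and not measured from the game.

```python
import math


def table_transfer_time(items, handoff_time, slots=1):
    """Swap table: low latency, but only `slots` items fit per hand-off round."""
    rounds = math.ceil(items / slots)
    return rounds * handoff_time


def belt_transfer_time(items, belt_latency, spacing):
    """Conveyor belt: the first item takes belt_latency to arrive, then one
    more item arrives every `spacing` time units (pipelined)."""
    if items == 0:
        return 0
    return belt_latency + (items - 1) * spacing


for n in (1, 6):
    print(n, table_transfer_time(n, handoff_time=3), belt_transfer_time(n, belt_latency=8, spacing=1))
```

With these made-up numbers, a single item is faster over the table (3 vs. 8 time units), while a batch of six is faster on the belt (18 vs. 13), which is why batching pays off on belts.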

Get familiar with your kitchen and good luck! (Well, it is a bit boring if you have learned Machine Architecture and Operating System internals).

The case of UEFI for Windows on ARM, and comparison with LK/ABoot

Nights before trips are always boring, so I decided to write something to pass the time. We now have Windows 10 on ARM running on the Dragonboard 410c and the Lumia 950 XL (article in Chinese, sorry). It will be helpful to write down some firmware-related information about platform bring-up for future reference. Meanwhile, a comparison with Little Kernel, the common Android Linux bootloader (well, Qualcomm says so), should provide useful information for the Android on Lumia project.

I recommend reading this article if you are not familiar with UEFI.

Assumptions, assumptions

Compared to Linux, the Windows kernel assumes that the platform firmware and the bootloader (a.k.a. Windows Boot Manager) have prepared the basic environment for a successful kernel initialization. If certain components are not initialized, bugchecks may occur. Even if the system launches successfully, it may behave unexpectedly (weird things). An official document explains these requirements in detail.

Little Kernel initializes basic hardware too (at least you need serial for debugging). Certain peripherals, including clocks, regulators, and USB, are also initialized for application purposes (e.g. Fastboot). Still, it initializes as few peripherals as possible. Sometimes even the panel is not brought up (I’ve seen a case on an Android phone).

In short, you have to do more for a successful Windows bring-up:

  • If you know certain components are in the usable state already, skip initialization procedures. For example, on Lumia 950 XL, our UEFI implementation does not need to initialize USB since our bootstrapper (Qualcomm UEFI) did so.
  • If your platform has PCIe components, clock them up, set regulators and mappings, etc.
  • Initialize at least one debug resource described in your DBG2 table (if applicable, likely on all ARM platforms).
  • Bring up the panel, set basic display parameters and pass a framebuffer pointer to Windows.

So how about Linux? If your Linux platform uses DT instead of ACPI, you likely don’t need to do most of the things Windows requires. On Qualcomm platforms, the Linux kernel clocks up the PCIe cores and sets regulators and mappings to bring them into a usable state. If your platform uses standard ACPI and the platform drivers do not perform additional initialization, initialize these components in firmware.

Fill the hole

Both UEFI with ACPI and LK perform fix-up tasks before transferring control to the kernel. On Qualcomm platforms, chipset metadata (revision, foundry ID, etc.) is filled into the DSDT, and certain logic in the DSDT depends on it. A typical Android Linux device ships with a large DT covering multiple variants. LK selects the best fit using the chipset ID, PMIC ID, and board ID, then fills in some memory region information for the kernel to use.
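The “select the best fit” step can be pictured roughly like this. To be clear, this is not LK’s actual matching code: the `DtEntry` fields, the `select_dt` function, the scoring rule (exact chipset and board match, prefer a matching PMIC, then the newest revision the SoC supports) and all ID values are assumptions I made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class DtEntry:
    """One device tree in a multi-variant image (hypothetical layout)."""
    name: str
    chipset_id: int
    board_id: int
    pmic_id: int
    soc_rev: int


def select_dt(entries: Sequence[DtEntry], chipset_id: int, board_id: int,
              pmic_id: int, soc_rev: int) -> Optional[DtEntry]:
    """Pick the best-fitting entry, or None if nothing is compatible."""
    candidates = [
        e for e in entries
        if e.chipset_id == chipset_id
        and e.board_id == board_id
        and e.soc_rev <= soc_rev  # never pick a DT made for a newer revision
    ]
    if not candidates:
        return None
    # Prefer an exact PMIC match first, then the highest compatible revision.
    return max(candidates, key=lambda e: (e.pmic_id == pmic_id, e.soc_rev))


# Made-up table; the numbers are placeholders, not real Qualcomm IDs.
TABLE = [
    DtEntry("variant-a", 1, 10, 100, 1),
    DtEntry("variant-a-v2", 1, 10, 100, 2),
    DtEntry("variant-b", 1, 20, 100, 1),
    DtEntry("variant-a-altpmic", 1, 10, 200, 2),
]

best = select_dt(TABLE, chipset_id=1, board_id=10, pmic_id=200, soc_rev=2)
print(best.name if best else "no match")
```

After choosing an entry, the bootloader would still have to patch in the memory region information mentioned above; that part is left out here.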

The ACPI tables in the firmware for Windows 10 on ARM are pre-patched, so I don’t implement the fix-up logic separately.

Multi-processor Startup, Again

Why am I discussing the thing again? Because it is important.

Little Kernel (and likely other Android Linux bootloaders) only uses a single processor during its lifecycle (a notable exception is the Raspberry Pi, which uses a spin table, except on the 3+). When it transfers control to Linux, Linux brings the other cores out of the reset state and makes them available for use.

Windows platforms that implement the ACPI Multi-Processor Parking Protocol behave differently. Although the firmware uses a single core, the other CPU cores are brought out of the reset state and instructed to run a special piece of code. The code flow is like this:

parking:
    Wait for an interrupt.
    Am I the processor being woken up?
    If yes, jump to the address the OS gave me.
    If not, go back to parking.

(Interrupt acknowledgment and memory barriers ignored. Sorry, I don’t want to write assembly at 11 PM.)
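If you prefer something executable over assembly, the flow can be mimicked with a small simulation. This is a loose sketch rather than firmware code: the `Mailbox` fields, the `INVALID_ID` marker, and the helper names are my own simplifications, real interrupts are replaced by a plain method call, and cache maintenance and barriers are still ignored.

```python
from dataclasses import dataclass
from typing import List, Optional

INVALID_ID = 0xFFFFFFFF  # assumed marker meaning "nobody is being woken up"


@dataclass
class Mailbox:
    """Per-core mailbox shared between OS and firmware (simplified)."""
    processor_id: int = INVALID_ID
    jump_address: int = 0


class ParkedCore:
    def __init__(self, core_id: int):
        self.core_id = core_id
        self.mailbox = Mailbox()
        self.running_at: Optional[int] = None  # None means still parked

    def on_interrupt(self) -> bool:
        """One pass through the parking loop after an interrupt arrives."""
        if self.running_at is not None:
            return False  # already handed over to the OS
        if self.mailbox.processor_id != self.core_id:
            return False  # not me: go back to parking
        if self.mailbox.jump_address == 0:
            return False  # no entry point yet: go back to parking
        target = self.mailbox.jump_address
        self.mailbox.jump_address = 0  # acknowledge by clearing the address
        self.running_at = target       # stand-in for the actual jump
        return True


def os_wake_core(cores: List[ParkedCore], core_id: int, entry: int) -> None:
    """OS side: fill the target's mailbox, then interrupt every parked core."""
    target = next(c for c in cores if c.core_id == core_id)
    target.mailbox.jump_address = entry   # address first...
    target.mailbox.processor_id = core_id  # ...then the ID
    for core in cores:  # stand-in for sending an interrupt
        core.on_interrupt()


cores = [ParkedCore(i) for i in (1, 2, 3)]
os_wake_core(cores, core_id=2, entry=0x80080000)
print([hex(c.running_at) if c.running_at is not None else "parked" for c in cores])
```

Only core 2 leaves the loop here; cores 1 and 3 see the interrupt, find that their own mailbox does not name them, and go back to parking.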

Because different platforms handle core startup differently (on Qualcomm platforms, TrustZone is involved), booting the Linux kernel and starting cores the Linux way on top of a UEFI firmware that implements this protocol may fail. Someone told me he was unable to bring up the other three cores on the 640. That is reasonable, since LK on recent Lumia phones is launched via a special UEFI application in Windows Boot Application form. Qualcomm UEFI has already put the other three cores in the running state (waiting in WFI). Neither LK nor Linux is aware of that (both make assumptions about the core state), so core startup fails.

Since it is not possible to ditch Qualcomm UEFI (unlike with the exploit for first-generation Lumia WP8 devices), we have to cope with the parking protocol in AArch32 mode (AArch64 SoCs have PSCI):

  • Ignore the other cores. Unicore is the best.
  • Implement the parking protocol for unsupported systems (not too hard). Linux has support for the protocol; you have to enable it.
  • Go AArch64 and use PSCI (remember to use HVC mode for 8992/8994).

 

Good night (and to my girlfriend: if you see this article, sorry that I said “Good Night” too early).