PCIe Bifurcation – What is it? How to enable it? Optimal configurations and use cases for NVMe SSDs/GPUs
One of my colleagues asked me about “Bifurcation” when it came up in a discussion about running multiple NVMe drives from a single PCIe slot. As I explained what it is and why one should consider it before making a motherboard purchase, I shared my own home lab experience – which brought me to write this blog for the wider community.
Background: I had multiple disks in my Supermicro server – three(3) NVMe SSDs, two(2) SSDs and two(2) HDDs. One of my SATA HDDs (2TB in size) decided to go kaput recently. I thought of replacing it with an equivalent-size internal SATA SSD to keep the expense low. Checking prices on Amazon, 2TB SSDs were somewhere in the range of £168 to £200, and somehow I stumbled upon the WD Blue SN550 NVMe SSD 1TB for just £95 – I remember buying my first Samsung EVO 960 1TB NVMe for about £435 in September 2017, and the 2TB version of the same brand/model was around a grand!
The read and write speeds of 545 MB/s and 540 MB/s respectively for the fastest 2TB SATA SSD in that price bracket, the SanDisk SSD Plus, were no comparison to the WD Blue SN550 NVMe SSD’s 2400 MB/s and 1950 MB/s. As I was already using three(3) NVMe SSDs and loved how my nested VMware vSphere (ESXi) home lab ran “smooth as butter”, it became a no-brainer to buy two(2) x 1TB WD NVMe SSDs instead. Of course, I would have to invest additional money in more NVMe PCIe adapters, but I would rather spend a little more now to future-proof my home lab/server and get the additional speed bump ;).
The PCIe-based peripherals in my home lab are a Supermicro Quad Port NIC card, an Nvidia GPU and Samsung NVMe SSDs. This blog will focus on the NVMe SSDs and the GPU as examples, covering the following:
- What is PCIe Bifurcation?
- Interpretation of an example motherboard layout and its architecture
- Limitations of the example motherboard
- Understand default PCIe slot behaviour with Dual NVMe PCIe adapter
- How to enable PCIe Bifurcation?
- Optimal PCIe Bifurcation configurations – three(3) use cases
Before we begin, let’s get the dictionary definition out of the way:
What is PCIe Bifurcation?
PCIe bifurcation is no different from the dictionary definition, i.e. dividing a PCIe slot into smaller chunks/branches. For example, a PCIe x8 slot could be bifurcated into two(2) x4 chunks, and a PCIe x16 into four(4) x4 i.e. x4x4x4x4, or two(2) x8 i.e. x8x8, or one(1) x8 and two(2) x4 i.e. x8x4x4 / x4x4x8 (if it does not make sense now, it will later – keep reading).
Note: PCIe Bifurcation does not decrease speed; it only splits/bifurcates lanes. To use bifurcation, the motherboard must support it, and the BIOS must expose the option as well.
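To make the x4x4x4x4 / x8x4x4 notation concrete, here is a small illustrative sketch (plain Python, not tied to any real hardware or BIOS) that checks whether a bifurcation pattern uses exactly the lanes of its parent link:

```python
# Illustrative only: parse bifurcation patterns like "x8x4x4" and check that
# the chunks add up to the width of the parent PCIe link.

def lanes(pattern: str) -> list[int]:
    """Split a pattern such as 'x4x4x8' into its lane chunks, e.g. [4, 4, 8]."""
    return [int(chunk) for chunk in pattern.lower().split("x") if chunk]

def fits(pattern: str, link_width: int) -> bool:
    """A pattern is valid when its chunks consume exactly the link's lanes."""
    return sum(lanes(pattern)) == link_width

for pattern in ("x4x4", "x8x8", "x4x4x4x4", "x8x4x4", "x4x4x8"):
    print(f"{pattern:10} -> x8: {fits(pattern, 8)}, x16: {fits(pattern, 16)}")
```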
When I bought the Supermicro X10SRH-CLN4F motherboard in September 2017, it came with BIOS 2.0b, which did not have any “Bifurcation” options. As a result, when I plugged my Supermicro AOC-SLG3-2M2 (Dual NVMe PCIe Adapter) into any slot, it would only detect one(1) of the two(2) NVMe SSDs installed. To get the card to detect both NVMe SSDs, “PCIe Bifurcation” was required, which became available in a later BIOS version that was not publicly available at the time – but Supermicro support was great and the engineer shared it with me before it went GA.
Ok, let’s take the example of my motherboard (Supermicro X10SRH-CLN4F) layout below:
It has six(6) physical PCIe slots – labelled as slots 2, 3, 4, 5, 6 and 7. However, the CPU socket only provides three(3) PCIe 3.0 links, plus one(1) PCIe 2.0 link via DMI2/PCH (Platform Controller Hub). They are numbered as 1, 2 or 3, followed by a letter (shown in the block diagram architecture below):
Interpretation of motherboard layout and its architecture:
- CPU/PCIe Link 1: Port 1A – used for the LSI SAS3008 I/O controller
- CPU/PCIe Link 2: Ports 2A and 2C – Link 2 is PCIe 3.0 x16, which is split between slot5 and slot6 as x8 lanes each (despite the physical slot6 being x16 in size).
- CPU/PCIe Link 3: Ports 3A, 3C and 3D – Link 3 is also PCIe 3.0 x16, which is split between slot4, slot7 and the LAN i350 as x8, x4 and x4 lanes respectively (despite the physical slot7 being x8 in size).
- DMI2 – used for slot2 and slot3 via the PCH (Platform Controller Hub):
  - PCIe 2.0 x4 for slot2 (despite the physical slot size of x8)
  - PCIe 2.0 x2 for slot3 (despite the physical slot size of x4)
I created the following table for easier understanding (you could do the same for your motherboard):
PCIe Slot Number | CPU/PCIe Port | PCIe version | PCIe Slot Size | PCIe Lanes |
2 | DMI2 | 2.0 | x8 | x4 |
3 | DMI2 | 2.0 | x4 | x2 |
4 | 3A | 3.0 | x8 | x8 |
5 | 2A | 3.0 | x8 | x8 |
6 | 2C | 3.0 | x16 | x8 |
7 | 3C | 3.0 | x8 | x4 |
Limitations of the motherboard:
This motherboard restricts the use of a “Quad NVMe PCIe Adapter” (not that I have a requirement for it…yet) because no slot has a dedicated x16 link. However, I can use a maximum of three(3) “Dual NVMe PCIe Adapters” thanks to the three(3) x8 PCIe links available, plus two(2) more “Single NVMe PCIe Adapters” in the remaining two(2) x4 PCIe links, if needed.
The Supermicro X10SRH-CLN4F motherboard has been running pretty sweet for me so far, and it will cover my estimated requirements for future PCIe storage expansion. However, if you are in the market for a new motherboard and intend to run several PCIe-based peripherals (including GPUs) – consider these limitations before you make the purchase.
Understand default PCIe slot behaviour with Dual NVMe PCIe adapter:
Ok, let’s now talk about the “Dual NVMe PCIe adapter”, e.g. the Supermicro AOC-SLG3-2M2 (or any other), which requires x8 PCIe lanes:
- If one(1) SSD is installed in the “Dual NVMe PCIe adapter” and the adapter is plugged into any PCIe slot (except slot3, which only has x2 PCIe lanes) – the NVMe SSD will be detected.
- If two(2) SSDs are installed in a “Dual NVMe PCIe adapter” and the adapter is plugged into either PCIe slot2 or slot7 (x4 lanes each) – only one(1) NVMe SSD will be detected.
- If two(2) SSDs are installed in a “Dual NVMe PCIe adapter” and the adapter is plugged into PCIe slot4, 5 or 6 – by default, again only one(1) NVMe SSD will be detected.
The last option above is the only one capable of detecting both NVMe SSDs, because those slots have x8 PCIe lanes available – and this is where bifurcation comes into the picture.
How to enable PCIe Bifurcation?
As mentioned before, the motherboard must be compatible and the BIOS must offer an option to enable it. You will need to dig out the bifurcation options for your motherboard in the BIOS settings; for the Supermicro X10SRH-CLN4F (BIOS v3.2, build 11/22/2019) they are located here:
BIOS -> Advanced -> Chipset Configuration -> North Bridge -> IIO Configuration -> IIO1 Configuration:
Optimal PCIe Bifurcation Configuration – Use case 1:
If the “Dual NVMe PCIe Adapter” is plugged into PCIe slot4, the “IOU1 (IIO1 PCIe Port 3)” config needs to be changed from “Auto” to “x4x4x4x4”, which results in the PCIe v3.0 Link3 being split/bifurcated into four(4) x4 chunks, and the table will now look like this:
PCIe Slot Number | CPU/PCIe Port | PCIe version | PCIe Slot Size | PCIe Lanes |
2 | DMI2 | 2.0 | x8 | x4 |
3 | DMI2 | 2.0 | x4 | x2 |
4 | 3A | 3.0 | x8 | x4x4 |
5 | 2A | 3.0 | x8 | x8 |
6 | 2C | 3.0 | x16 | x8 |
7 | 3C | 3.0 | x8 | x4 |
Note: As I explained in the “Interpretation of motherboard layout and its architecture” section, CPU/PCIe Link 3 has three(3) ports, i.e. CPU/PCIe Ports 3A, 3C and 3D. CPU/PCIe Port 3A is the only port affected by this config change, which splits/bifurcates it from x8 into x4x4, and as an outcome both NVMe SSDs will now be detected. The remaining CPU/PCIe Ports 3C and 3D are unaffected as they were already using x4 lanes.
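For anyone who prefers code over tables, below is a toy model (plain Python, built purely from the tables in this post – it is not a Supermicro tool and talks to no hardware) showing which slots each IOU setting actually touches:

```python
# Toy model of the X10SRH-CLN4F slot layout, taken from the tables above.
# It only illustrates which slots change when an IOU setting is altered.

DEFAULT = {          # lane allocation with both IOU settings on "Auto"
    "slot4 (3A)": "x8",
    "slot5 (2A)": "x8",
    "slot6 (2C)": "x8",
    "slot7 (3C)": "x4",
}

def apply_iou(port3: str = "Auto", port2: str = "Auto") -> dict:
    layout = dict(DEFAULT)
    if port3 == "x4x4x4x4":
        layout["slot4 (3A)"] = "x4x4"   # ports 3C/3D already run at x4, so only 3A changes
    if port2 == "x4x4x4x4":
        layout["slot5 (2A)"] = "x4x4"
        layout["slot6 (2C)"] = "x4x4"
    elif port2 == "x4x4x8":
        layout["slot5 (2A)"] = "x4x4"   # slot6 keeps its full x8, e.g. for a GPU
    return layout

print(apply_iou(port3="x4x4x4x4"))                   # use case 1
print(apply_iou(port2="x4x4x4x4"))                   # use case 2
print(apply_iou(port3="x4x4x4x4", port2="x4x4x8"))   # use case 3
```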
Optimal PCIe Bifurcation Configuration – Use case 2:
If the “Dual NVMe PCIe Adapter” is plugged into PCIe slot5 instead, the “IOU1 (IIO1 PCIe Port 2)” config needs to be changed from “Auto” to “x4x4x4x4”, which results in the PCIe v3.0 Link2 being split into four(4) x4 chunks, and the table will then look like this:
PCIe Slot Number | CPU/PCIe Port | PCIe version | PCIe Slot Size | PCIe Lanes |
2 | DMI2 | 2.0 | x8 | x4 |
3 | DMI2 | 2.0 | x4 | x2 |
4 | 3A | 3.0 | x8 | x8 |
5 | 2A | 3.0 | x8 | x4x4 |
6 | 2C | 3.0 | x16 | x4x4 |
7 | 3C | 3.0 | x8 | x4 |
Optimal PCIe Bifurcation Configuration – Use case 3 (my use case):
If you have a GPU (which I do – an Nvidia 1080Ti installed in PCIe slot6) along with multiple PCIe-based NVMe SSDs, the optimal configuration is as follows: to get peak performance from PCIe slot6 (all x8 lanes) while successfully detecting two(2) NVMe SSDs in PCIe slot5, change the “IOU1 (IIO1 PCIe Port 2)” config from “Auto” to “x4x4x8”; and to detect another pair of NVMe SSDs in PCIe slot4, change “IOU1 (IIO1 PCIe Port 3)” from “Auto” to “x4x4x4x4”. The table will now look like this:
PCIe Slot Number | CPU/PCIe Port | PCIe version | PCIe Slot Size | PCIe Lanes | Peripherals attached |
2 | DMI2 | 2.0 | x8 | x4 | |
3 | DMI2 | 2.0 | x4 | x2 | |
4 | 3A | 3.0 | x8 | x4x4 | Dual NVMe PCIe adapter (2x NVMe SSDs) |
5 | 2A | 3.0 | x8 | x4x4 | Dual NVMe PCIe adapter (2x NVMe SSDs) |
6 | 2C | 3.0 | x16 | x8 | Nvidia 1080Ti GPU |
7 | 3C | 3.0 | x8 | x4 | |
I have plans to install three(3) more NVMe SSDs in the next couple of weeks, i.e. two(2) x 1TB to replace my failed 2TB HDD and another 2TB for future prospects (possibly all three in 2TB sizes if there are any deals on the upcoming Amazon Prime Day this month).
Hope this helps with your purchase decision, or helps you understand your existing motherboard’s architecture and its PCIe bifurcation configurations.
Considering connecting two graphics cards to a single slot – based on the TechQuickie video
Bifurcation. It’s a fancy word that means splitting something into two parts, and today we are talking about bifurcating something that is not a sandwich, a sporting event or a marriage. This post is about bifurcating a PCI Express slot.
It is exactly what it sounds like: taking a PCI Express slot and dividing it up so we can use multiple devices. But how does that work? I mean, you can’t just shove two graphics cards into one slot.
To answer that, we turned to Philip Maher from TYAN,
and we would like to thank him for his input. You need to look at where and how the PCI Express lanes are laid out on modern platforms. Your CPU has a certain number of controllers, and each of them can only support one device. For example, even though a current Intel CPU provides 16 lanes, they are divided across four controllers,
each managing four lanes, which means you can connect a maximum of four PCI Express devices. Of course, you can install a single graphics card, use all 16 lanes and leave it at that.
But let’s say you want to use several PCI Express storage devices, such as NVMe SSDs. That is where bifurcation comes in. You see, one of the most common uses of bifurcation is connecting multiple M.2 drives to a single PCI Express slot. This can be done with a fancy card
that looks something like this, but here is the catch. If you are going to split a single slot, you need to tell the motherboard, inside the BIOS, exactly how to divide those lanes.
Otherwise, it will only see one of those four devices, and you will regret spending the money. Exactly how you split the lanes depends on the platform you are using. As mentioned earlier, the most you can typically do on a mainstream Intel platform is four devices with four lanes each. But on a server platform such as AMD EPYC you can get up to around eight devices, albeit with two lanes each, if you run that many devices. Bifurcation is far more common in the enterprise,
since data centres tend to use PCI Express lanes for things you simply will not find at home, such as
FPGAs and ASICs – chips and devices that can be configured to handle very specific workloads. But that does not mean bifurcation has no home use case, for example running two graphics cards in a single slot. Although gamers typically use just one graphics card at x16, running a card at x8 has almost no impact on performance, because even modern high-end cards do not move enough data to really need those extra lanes. And if you have a motherboard with two free full-size PCI Express slots, the lanes are split automatically by a switch called a multiplexer, so that each card runs at x8.
But nothing really stops you from plugging in a riser card with two PCI Express x16 slots,
connecting two graphics cards with riser cables and asking the BIOS to split the slot into two x8 connections. Just think – you could run NVLink on a single-slot Mini-ITX board if you find a way to fit the cards in your case, so hopefully you do not mind using some duct tape.
But if you are interested in using PCI Express bifurcation, no matter what you plan to connect, you need to make sure your motherboard and BIOS support it.
Not every motherboard lets you enable it in the BIOS, and if it does not, that pretty much stops the show. So do your research before deciding to turn your home rig into some kind of mad science experiment.
Delightly Linux
PCIe Bifurcation and NVMe RAID in Linux Part 1: The Hardware
April 8, 2023 “PCIe Bifur…..WHAT?!”
Ooooh! Sounds fancy, right? On some motherboards, the BIOS will allow a single physical PCIe x16 slot to be divided into two or more logical PCIe slots in order to install multiple NVMe SSDs (two, three, or four) using an adapter card. This is PCIe bifurcation, and Linux is compatible with motherboards that support it.
What would benchmark numbers look like if we put two NVMe devices in RAID-0? How about RAID-1? How well would it compare to a single NVMe? What would be the best data storage arrangement if using NVMe? Are there different techniques to follow compared to RAID with mechanical drives?
Here are my experiments in an attempt to help protect data stored on a Linux system with the hopes of providing faster redundancy while exploring PCIe bifurcation on a system running Ubuntu Cinnamon 22.04.
PCIe Bifurcation allows you to install two M-key NVMe devices in a single PCIe slot for storage expansion, LVM, or RAID. Each NVMe device has its own x4 dedicated lanes for maximum speed, limited only by the PCIe slot it is connected to.
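As a quick sanity check on any Linux box, a minimal sketch (assuming the standard sysfs layout) can list each NVMe controller and the PCIe link it negotiated – after bifurcation, each drive on the adapter card should report its own x4 link:

```python
# Minimal sketch: report the negotiated PCIe link width/speed for every NVMe
# controller via sysfs. Assumes a standard Linux sysfs layout.
from pathlib import Path

for ctrl in sorted(Path("/sys/class/nvme").glob("nvme*")):
    pci_dev = ctrl / "device"   # symlink to the controller's PCI device directory
    model = (ctrl / "model").read_text().strip()
    width = (pci_dev / "current_link_width").read_text().strip()
    speed = (pci_dev / "current_link_speed").read_text().strip()
    print(f"{ctrl.name}: {model} -> x{width} @ {speed}")
```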
An example of the kind of ideal RAID-0 performance we can expect with two SN770 NVMe devices. This is a synthetic benchmark that shows the absolute maximum speeds possible in a PCIe Gen 3.0 slot utilizing PCIe bifurcation. However, do not be moved by benchmark numbers. Real-world performance is more mundane.
Why Not Just Use Motherboard M.2 Slots?
In my case, the entire technique requires PCIe bifurcation because the M.2 slots on the motherboard are inadequate for what I want to achieve: RAID-1 using two M-key NVMe devices.
“If you have two M.2 slots, why not just RAID them together?”
Reason 1. The primary M.2 slot is reserved for the OS NVMe. This is the slot near the CPU, and, as good practice, it should be dedicated to hosting the OS and nothing but the OS for optimal system performance.
Reason 2. The motherboard in the test system I am using is an older X470 that has a single PCIe 3.0 x4 M.2 slot connected to the CPU, and a secondary PCIe 3.0 x2 M.2 slot hosted by the chipset. Both are M-key slots.
Did you catch that limitation?
The second slot only has two PCIe lanes, not four. An NVMe installed in this slot maxes out at PCIe 3.0 x2 speeds (roughly equivalent to PCIe 2.0 x4). If these were RAIDed together, performance would not be optimal. Adding encryption reduces performance to begin with, so this arrangement would degrade performance even further when using RAID. Not good.
The solution? PCIe bifurcation!
Since the motherboard M.2 slots are out, the only other (inexpensive) option is to implement PCIe bifurcation. By using an additional dual NVMe adapter card that plugs into a bifurcated slot, it provides two dedicated PCIe 3.0 x4 M.2 slots. It also turns out that the test motherboard I have happens to support it.
What Is PCIe Bifurcation?
To bifurcate means to divide into two parts. On the X470 motherboard I am using for testing, only one PCIe 3.0 x16 slot supports PCIe bifurcation, and it must be enabled in BIOS. It will not automatically turn itself on. You must specifically enable it by entering BIOS. This changes the behavior of that x16 slot.
How the PCIe x16 slot is bifurcated on this particular motherboard. Bifurcation takes the x8 half of the slot and creates two virtual x4 slots. The result is referred to as a x4/x4 arrangement as shown here.
Not just any motherboard will work. You cannot take any existing motherboard PCIe slot, pop in a dual NVMe adapter, and expect it to work. It will not. A motherboard must specifically support PCIe bifurcation, and you will need to research and check the motherboard manual in great detail because this is often a hidden feature that does not receive attention.
PCIe Bifurcation Gotchas
PCIe bifurcation opens up new possibilities but also a few caveats.
Must have a compatible motherboard
Check the user manual online. Different vendors might call it something else. In my case, it was labeled “NVMe RAID,” not PCIe bifurcation, but it changed the physical x16 slot with x8 lanes (x8 electrical contacts, while the rest of the x16 slot has no contacts) into a split x4/x4, as PCIe bifurcation should. This means that an x8 dual NVMe adapter would work.
Must use a dual NVMe adapter that supports PCIe bifurcation
This is a separate purchase and not included with the motherboard (in my case). If you search for PCIe NVMe adapter cards, there are four kinds to be aware of. Only one will work with PCIe bifurcation.
- Single PCIe to NVMe adapter card. Does not work for this purpose. It allows one NVMe device to be installed in any PCIe slot matching the card’s lanes or more. It is a simple adapter card and should be compatible with any motherboard, including older boards lacking PCIe bifurcation support. They are inexpensive, so if all you need is a single extra NVMe, this is a good solution. You can even use two of them, each in its own PCIe slot, for RAID or LVM. A good all-round performer that I have used is a single M-key M.2 slot to PCIe adapter card; it delivers four PCIe lanes to the M-key slot for full performance. You would need two of these, one in each PCIe slot, to support NVMe RAID or LVM without PCIe bifurcation. In my test system, I did not have two available PCIe slots, so this option is out.
- Dual M.2 PCIe adapter card with one B-key and one M-key slot. Even though two M.2 slots are present, one is B-key (slower SATA) and the other is M-key (faster NVMe). This will not work with PCIe bifurcation and RAID.
- Quad NVMe adapter card. This allows any motherboard to enjoy four M-key NVMe devices in a single PCIe slot…but it is expensive. Why not just purchase a new motherboard at this price point?
- Dual M.2 NVMe adapter. The one from 10GTek is the one I used. It supports two M-key NVMe SSDs, each with a full four lanes of PCIe 3.0 or 4.0 goodness. Its cost is low, but it requires a motherboard and BIOS that support PCIe bifurcation.
Must be supported in BIOS
BIOS must have an option that allows you to turn PCIe bifurcation on or off. By default, it is disabled, so look through your BIOS or motherboard manual to find advanced PCIe options. If your motherboard supports PCIe bifurcation, then there will be an option for this in BIOS.
Limited PCIe Slot
You cannot pick and choose which PCIe slot on the motherboard will use PCIe bifurcation. This is predetermined by your motherboard. In my case, PCIe bifurcation was only supported on the second PCIe 3.0 x16 slot (x8 lanes). The first PCIe x16 slot near the CPU could not be bifurcated. Only the second one.
Other motherboards might vary. Shown here is a secondary PCIe 3.0 x16 slot.
Preset Bifurcation
The BIOS also dictates how the PCIe slot can be bifurcated. You do not get to choose x4/x4 or x8/x8 or x4/x4/x4/x4, for example. In my case, with my test system, only the second PCIe slot could be bifurcated, and even then it would only allow x4/x4 mode.
How Bifurcation Works
The motherboard on the test system limited PCIe bifurcation to this PCIe 3.0 x16 slot only. It might be x16 in size, but it is only an x8 lane slot.
This is why I needed to use an x8 PCIe dual NVMe adapter card. Only the first 8 lanes are relevant. The BIOS divides this slot into an x4/x4 arrangement. Both virtual slots are still PCIe 3.0 x4 each.
This dual NVMe adapter card needs a PCIe x8 slot, but it still fits in the x16 slot shown above. Notice the circuit traces? The eight lanes from the PCIe x8 slot are split into two sets of four so each M.2 slot has its own x4 connection. This is the way to go for maximum performance from both NVMe devices. The PCIe x16 (x8 electrical) slot this card connected to could only be bifurcated into x4/x4, which was perfect. On another motherboard where the PCIe x16 slot could only be bifurcated into x8/x8, this card would not work. Only one NVMe was recognized. Why? The first x8 of the two x8/x8 halves was treated as a single M.2 slot, so this card was seen as a single NVMe adapter card.
On a different motherboard, only the first PCIe x16 near the CPU slot could be bifurcated and then only into x8/x8 mode. There was no x4/x4/x4/x4 mode available, which allows four NVMe devices on a single adapter card. Again, check and double check the manual.
Linux Works Perfectly with PCIe Bifurcation
PCIe bifurcation is a hardware setting affecting the underlying hardware, so it is 100% compatible with Linux. There are no drivers to install and no special modifications necessary. You can take your existing Linux system, enable PCIe bifurcation, and Linux will recognize it without issue. Linux will see the NVMe devices as new drives added to the system. Ubuntu Cinnamon 22.04 performed every bit as well after enabling PCIe bifurcation as it did before.
Of course, I have not tried every Linux distribution, but from what I tested, everything worked out of the box with Linux.
The Hardware
Note: Nobody sponsors this. I found a project that I liked and wanted to share the results with others. Any links to Amazon are affiliate links to help readers locate the items and to help cover the time spent researching this article since I earn a commission on qualifying purchases at no extra cost to readers.
- Motherboard and CPU: X470 (PCIe 3.0) + Ryzen 5 2600
- 2x WD Black SN770 2TB NVMe
WD SN770 NVMe
My experiments will be conducted using two Western Digital Black SN770 NVMe devices. These are truly excellent NVMe SSDs on their own, so I want to see what performance would be like with RAID and PCIe bifurcation.
“If you have a PCIe 3.0 motherboard, why use SN770 PCIe Gen 4.0 NVMe?”
Three reasons: speed, price, and future upgrades.
Speed
I wanted to ensure that the NVMe was not the limiting factor. The SN770 NVMe is a speedy device delivering over 5000 MB/s reads in a PCIe Gen 4.0 slot, so it should be able to saturate a PCIe 3.0 x4 slot, which is backwards compatible. Indeed I benchmarked this, and a single SN770 maxes out what a PCIe 3.0 M.2 slot can allow. If I see low benchmark numbers, it will not be because of the SN770.
Price
I found the 2TB SN770 NVMe for the same price as a standard 2.5″ SATA SSD and slower PCIe 3.0 NVMe devices. If they all cost the same, why not buy the fastest of the group?
Future Upgrades
If I upgrade the motherboard in the future to one that supports PCIe 4.0 slots, then the SN770 will get an upgrade too, without my needing to purchase new NVMe devices and restore the data. I can use the NVMe as-is with future builds. If I bought PCIe 3.0 NVMe, then I would certainly want to upgrade to PCIe 4.0 NVMe later. Why not get a reasonable PCIe 4.0 NVMe now and be ready?
Since these will be used in an existing PCIe motherboard, there is no need to buy the latest and greatest NVMe for twice the cost such as the Samsung 990 Pro. That would be overkill in a PCIe 3.0 system. A PCIe 3.0 M.2 slot is limited to about 3600-3800 MB/s real-world throughput, so the extra cost of a more expensive NVMe would be wasted unless I upgraded the motherboard. What I have works well, so I saw little reason to do that.
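The back-of-the-envelope arithmetic behind that ~3600-3800 MB/s figure (a quick illustrative calculation, nothing more):

```python
# PCIe 3.0 runs each lane at 8 GT/s with 128b/130b encoding, so the raw
# ceiling of an x4 link is just under 4 GB/s; protocol overhead (TLP headers,
# flow control) brings real-world throughput down into the 3600-3800 MB/s range.
GT_PER_LANE = 8e9        # transfers per second per PCIe 3.0 lane
ENCODING = 128 / 130     # 128b/130b line encoding
LANES = 4

ceiling_mb_s = GT_PER_LANE * ENCODING / 8 * LANES / 1e6
print(f"PCIe 3.0 x{LANES} raw ceiling: {ceiling_mb_s:.0f} MB/s")   # ~3938 MB/s
```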
Dual M.2 NVMe PCIe Adapter Card
This card has two M.2 slots that only accept M-key NVMe devices. Four of the eight lanes from the x8 slot connect to each M.2 slot, so each NVMe gets its full x4 bandwidth.
Ventilation holes allow airflow. There are also two green LEDs, one per NVMe and visible through the holes, that blink during NVMe activity.
“What happens if you connect this into a non-bifurcated slot?” It will still work, but only one NVMe will be recognized by the system. Which NVMe out of the two might take some trial and error to determine.
NVMe Heat Sinks
The SN770s run painfully hot under load, so I installed a heat sink on each. This particular heat sink sandwiches the NVMe between thermal pads on the bottom and top. They still become warm, but nothing painfully hot. Screws are included with the heat sinks. The SN770 does not include a heat sink.
The Setup
With the research out of the way and the parts in hand, the first step is to prepare the dual adapter card.
Prepare the Adapter Card
Adapter card (left) with heat sinks installed on both SN770 NVMe devices. Top view and bottom view of heat sinks shown.
Both NVMe devices install easily and fasten in place using M.2 screws. This increases the weight of the card a little, but nothing to be concerned about.
Install in Computer
Installed in test system.
Enable PCIe Bifurcation in BIOS
Motherboards might vary, so check your manual. This option was buried in the Advanced\Onboard Devices Configuration menu. Nowhere does it read “PCIe Bifurcation.” Instead, it refers to it as “PCIe RAID Mode.” Same thing, so select this to switch the PCIe slot into x4/x4 operation. If not enabled, Linux will see only one of the two installed NVMe devices. The description in the box at the bottom explains what the setting does.
NOTE: The name “PCIe RAID Mode” shown in the BIOS is misleading. This is not hardware RAID, nor does it create a RAID array from within BIOS. Any NVMe RAID used on this board must be created as software RAID. All RAID was set up using mdadm in Linux.
Check Disks Utility in Ubuntu
After rebooting the system, open Disks to find out if Ubuntu Cinnamon 22.04 detects both NVMe devices.
Yes! Both SN770 NVMe SSDs are shown in the left pane of Disks. This means PCIe bifurcation is working.
Second NVMe successfully formatted as ext4.
Separate NVMe
Linux sees both NVMe devices as two separate and totally independent devices, and they will have their own device names. In this case, they are identified as /dev/nvme1n1 and /dev/nvme2n1, but this can vary depending upon any other NVMe devices installed in the system.
Treat them like any other storage devices on the system. LVM, RAID, single disks. It is up to your imagination at this point. They can be formatted with different file systems, placed in an NVMe RAID array, or set up as physical volumes in LVM. I tested these situations and they all work flawlessly with Linux.
Quick Benchmarks
Benchmarks will be covered in a separate article because there are a number of RAID-related surprises and performance issues when using LUKS or VeraCrypt full-disk encryption. But to test what these drives can do right now without any RAID or encryption, I ran Disks benchmark and KDiskMark to view the maximum potential possible using PCIe 3.0.
KDiskMark
Some quick tests comparing the KDiskMark standard preset with the NVMe preset. (Presets can be chosen from the KDiskMark menu.) No encryption used here. Both SN770 NVMe SSDs are on the dual adapter card using PCIe bifurcation.
The KDiskMark tests above show what we should expect from PCIe 3.0 x4 slots. Since each NVMe has its own four lanes (x4) and each SN770 has a theoretical maximum read speed of over 5000 MB/s, both can operate up to the limits of PCIe 3.0 x4 speeds. We know that the SN770s and the dual adapter card will not be the bottlenecks in future tests.
Disks 100x100M
First NVMe in adapter card. Disks shows that we are reaching about the best PCIe 3.0 x4 speeds possible. Graph performance is every bit as good as I was expecting.
Second NVMe in dual adapter card tested. Results are just as good as the first, so both M.2 slots operate at their full potential. From this, we know that both NVMe devices perform identically in the same dual adapter card no matter the M.2 slot.
Quick RAID-0 Test
Happy with these individual results, I had to perform a quick test to see what to expect when both SN770 devices are members of a RAID-0 array created with mdadm.
Wow! I had calculated about double speeds during synthetic benchmarks, but to actually see it in operation after careful research and planning makes this project worthwhile. Each SN770 is 2TB in size, so RAID-0 doubles the available space to 4TB. However, this is just for testing. I want to see how RAID-1 will work with LUKS encryption to better protect data. RAID-0 stripes data, so it does not offer any protection in the event that one NVMe fails or goes missing. As mentioned in the beginning, these are synthetic numbers, so avoid being dazzled. Real-world performance with encryption is another story.
RAID-0 is one way to break the limits of a single PCIe 3.0 x4 M.2 slot. This is because reads and writes involving RAID-0 occur simultaneously across the two NVMe devices.
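For reference, here is a minimal sketch of how such a RAID-0 array can be assembled with mdadm (run as root; the device names below are the ones reported on my system and yours may differ, so treat this as an outline rather than a copy-paste recipe):

```python
# Minimal sketch: create an mdadm RAID-0 array from the two bifurcated NVMe
# devices and put ext4 on it. Device names are assumptions from this system.
import subprocess

devices = ["/dev/nvme1n1", "/dev/nvme2n1"]   # adjust to the names Linux assigned

subprocess.run(
    ["mdadm", "--create", "/dev/md0",
     "--level=0",                          # stripe across both NVMe devices
     f"--raid-devices={len(devices)}",
     *devices],
    check=True,
)
subprocess.run(["mkfs.ext4", "/dev/md0"], check=True)   # format, then mount as usual
```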
Conclusion
After running the RAID-0 test, I was elated with the success of this experiment, so I had high hopes for RAID-1 and encryption. Well, well, well. It turns out that NVMe RAID (both RAID-0 and RAID-1) combined with encryption introduces its own set of issues that I was not expecting. RAID might have been ideal with mechanical drives, but NVMe devices involve a different technology to wrestle with.
Unexpectedly, I discovered that, for what I wanted to do with this configuration, a single NVMe yielded better overall performance than RAID-0. Surprise! Because of this, I found myself wondering if RAID was even worthwhile anymore when dealing with NVMe, but we will look at these details in part 2 of this series.
Part 2 will test various benchmarks and NVMe arrangements using these two SN770 devices. Is it worth the extra effort?
PCI-e bifurcation explained
OK, some asked about ‘what is bifurcation’ after the previous post. Essentially, if you have a PCI-e x8 slot, you can split it in half and make it two x4 slots. If you have an x16, you can make it one x8 and two x4, or four x4.
You can see below, I’ve overlaid my BIOS setup on top of the motherboard diagram (here a SuperMicro X10DRi-LN4+). Now, if your BIOS doesn’t have a bifurcation option, you can possibly get it to do so by adding support into the BIOS. I’m not going to help you with this, it’s very complex, but I was able to add both UEFI NVMe boot and bifurcation to a different SuperMicro motherboard by adding the UEFI modules into it manually. YMMV. Void where prohibited.
So in my case, I have an NVMe carrier which is capable of holding 4 NVMe drives. It is passive (no PCI bridge is onboard). This means that it is conceptually four PCI-E x4 devices. Without bifurcation, it just won’t work. Some people refer to this as ‘PCI splitting’. You may see references to this in the crypto-mining industry, where people are using x1 interfaces via cables to mining ASICs.
Be careful here, bifurcation is supported on server motherboards with modern chipsets, but its support on desktops is not as universal. And just because your motherboard supports it doesn’t mean your BIOS will.
It may also have downstream effects on other PCI-e cards, e.g. reducing their lane width. Caveat emptor.
EDIT (2021-01-24)
There have been a lot of questions on this article since I published it. I’ll cover a few of them here.