37 bad PEBs on an AXE16000


Armand28

I’ve had the router for a year. While going through the logs, I found two entries from last night where bad PEBs were flagged. I went to the console and found that 37 of the 40 reserved PEBs are already used up, which leaves me only 3 in reserve. Does the system allocate more, or am I bricked when I hit the limit? The router is just outside its warranty period. What do you think my options are?

UBI version: 1
Count of UBI devices: 1
UBI control device major/minor: 10:59
Present UBI devices: ubi0

ubi0
Volumes count: 9
Logical eraseblock size: 126976 bytes, 124.0 KiB
Total amount of logical eraseblocks: 1979 (251285504 bytes, 239.6 MiB)
Amount of available logical eraseblocks: 294 (37330944 bytes, 35.6 MiB)
Maximum count of volumes 128
Count of bad physical eraseblocks: 37
Count of reserved physical eraseblocks: 3

Current maximum erase counter value: 982
Minimum input/output unit size: 2048 bytes
Character device major/minor: 249:0
Present volumes: 1, 2, 3, 4, 5, 6, 10, 11, 13

GT-AXE16000-7B5C:/tmp/home/root# dmesg | grep -i 'bad'
ubi0: good PEBs: 1979, bad PEBs: 37, corrupted PEBs: 0
ubi0: available PEBs: 294, total reserved PEBs: 1685, PEBs reserved for bad PEB handling: 3
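
To keep an eye on these counters over time, the relevant fields can be pulled out of a saved copy of the output above. This is only a minimal sketch: the function name and the dictionary keys are made up for illustration, the field labels are copied from the output in this post, and it assumes the text was saved and parsed on a PC rather than on the router.

```python
import re

# Field labels copied from the output above; the short keys are hypothetical.
FIELDS = {
    "bad": "Count of bad physical eraseblocks",
    "reserved": "Count of reserved physical eraseblocks",
    "max_ec": "Current maximum erase counter value",
}


def parse_ubi_health(text):
    """Return the integer value of every known field found in the text."""
    result = {}
    for key, label in FIELDS.items():
        match = re.search(re.escape(label) + r":\s*(\d+)", text)
        if match:
            result[key] = int(match.group(1))
    return result


sample = """\
Count of bad physical eraseblocks: 37
Count of reserved physical eraseblocks: 3

Current maximum erase counter value: 982
"""

health = parse_ubi_health(sample)
print(health)
```

Comparing the "bad" and "reserved" values from snapshots taken a few days apart shows whether the reserve is still shrinking.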
 
The FAQ section of the UBI FAQ and HOWTO answers exactly this question:
Q: What happens when the PEBs reserved for bad block handling run out?
A: By default, about 2% of the whole chip size (20/1024 PEB) are reserved for bad blocks handling. If the number of blocks that turn bad exceeds that allocation, an error message will be printed and UBI will switch to read-only mode.
Note: If at attach time, there's already more bad blocks than reserved PEBs, UBI will stay in read-write mode. The switching to read-only mode only occurs when a new bad block appears.
However, it is worth paying attention to how quickly the bad blocks appear. I have read many times that the initial number of bad blocks does not correlate with the later reliability of the memory, so if these 37 bad blocks were there from the very beginning, I think there is no need to worry. If they began to appear while the router was in use, then, alas, once the reserve runs out the router will most likely become inoperative.
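
As a sanity check, the 20/1024 figure from the FAQ can be applied to the numbers posted above. This is a rough sketch with stated assumptions: the function name is hypothetical, the total PEB count is taken as good plus bad PEBs from the dmesg line (1979 + 37), and rounding up is assumed. The kernel may count the whole chip rather than the UBI partition, but a 2048-PEB total gives the same result.

```python
def bad_peb_reserve(total_pebs, per_1024=20):
    """Reserve size at per_1024 PEBs per 1024 PEBs, rounded up (assumed)."""
    return (total_pebs * per_1024 + 1023) // 1024


good_pebs = 1979  # from "good PEBs: 1979"
bad_pebs = 37     # from "bad PEBs: 37"

reserve = bad_peb_reserve(good_pebs + bad_pebs)
remaining = reserve - bad_pebs

print(reserve, remaining)  # prints "40 3"
```

That lines up with the 37 out of 40 mentioned in the first post and the 3 reserved PEBs reported by the router.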
 
Last night I had TWO of them:

Oct 15 22:00:16 kernel: ubi0 error: ubi_io_write: error -5 while writing 2048 bytes to PEB 1622:0, written 0 bytes
Oct 15 22:00:16 kernel: CPU: 1 PID: 119 Comm: ubi_bgt0d Tainted: P O 4.19.183 #1
Oct 15 22:00:16 kernel: Hardware name: GTAXE16000_2GB (DT)
Oct 15 22:00:16 kernel: Call trace:
Oct 15 22:00:16 kernel: dump_backtrace+0x0/0x150
Oct 15 22:00:16 kernel: show_stack+0x14/0x20
Oct 15 22:00:16 kernel: dump_stack+0x94/0xc4
Oct 15 22:00:16 kernel: ubi_io_write+0x574/0x690
Oct 15 22:00:16 kernel: ubi_io_write_ec_hdr+0xc4/0x110
Oct 15 22:00:16 kernel: sync_erase.isra.0+0x11c/0x1f0
Oct 15 22:00:16 kernel: __erase_worker+0x34/0x460
Oct 15 22:00:16 kernel: erase_worker+0x18/0x80
Oct 15 22:00:16 kernel: do_work+0x98/0x120
Oct 15 22:00:16 kernel: ubi_thread+0x108/0x190
Oct 15 22:00:16 kernel: kthread+0x118/0x150
Oct 15 22:00:16 kernel: ret_from_fork+0x10/0x24
Oct 15 22:00:16 kernel: ubi0: dumping 2048 bytes of data from PEB 1622, offset 0
Oct 15 22:00:16 kernel: ubi0 error: __erase_worker: failed to erase PEB 1622, error -5
Oct 15 22:00:16 kernel: ubi0: mark PEB 1622 as bad
Oct 15 22:00:16 kernel: ubi0: 4 PEBs left in the reserve
Oct 15 22:24:15 ddns: IP address, server and hostname have not changed since the last update.
Oct 15 22:54:15 ddns: IP address, server and hostname have not changed since the last update.
Oct 15 23:00:17 kernel: ubi0 error: ubi_io_write: error -5 while writing 2048 bytes to PEB 85:0, written 0 bytes
Oct 15 23:00:17 kernel: CPU: 1 PID: 119 Comm: ubi_bgt0d Tainted: P O 4.19.183 #1
Oct 15 23:00:17 kernel: Hardware name: GTAXE16000_2GB (DT)
Oct 15 23:00:17 kernel: Call trace:
Oct 15 23:00:17 kernel: dump_backtrace+0x0/0x150
Oct 15 23:00:17 kernel: show_stack+0x14/0x20
Oct 15 23:00:17 kernel: dump_stack+0x94/0xc4
Oct 15 23:00:17 kernel: ubi_io_write+0x574/0x690
Oct 15 23:00:17 kernel: ubi_io_write_ec_hdr+0xc4/0x110
Oct 15 23:00:17 kernel: sync_erase.isra.0+0x11c/0x1f0
Oct 15 23:00:17 kernel: __erase_worker+0x34/0x460
Oct 15 23:00:17 kernel: erase_worker+0x18/0x80
Oct 15 23:00:17 kernel: do_work+0x98/0x120
Oct 15 23:00:17 kernel: ubi_thread+0x108/0x190
Oct 15 23:00:17 kernel: kthread+0x118/0x150
Oct 15 23:00:17 kernel: ret_from_fork+0x10/0x24
Oct 15 23:00:17 kernel: ubi0: dumping 2048 bytes of data from PEB 85, offset 0
Oct 15 23:00:17 kernel: ubi0 error: __erase_worker: failed to erase PEB 85, error -5
Oct 15 23:00:17 kernel: ubi0: mark PEB 85 as bad
Oct 15 23:00:17 kernel: ubi0: 3 PEBs left in the reserve
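
To see how fast the reserve is draining, the "mark PEB ... as bad" and "PEBs left in the reserve" lines can be extracted from a saved syslog. A minimal sketch, assuming the log uses the same line format as the excerpt above; the function name and the event fields are hypothetical.

```python
import re

# Patterns modelled on the kernel lines in the log excerpt above.
BAD_RE = re.compile(
    r"^(\w{3} +\d+ \d\d:\d\d:\d\d) kernel: ubi0: mark PEB (\d+) as bad"
)
LEFT_RE = re.compile(r"kernel: ubi0: (\d+) PEBs left in the reserve")


def bad_peb_events(lines):
    """Collect one event per PEB marked bad, with the reserve left afterwards."""
    events = []
    for line in lines:
        match = BAD_RE.search(line)
        if match:
            events.append(
                {"time": match.group(1), "peb": int(match.group(2)), "left": None}
            )
            continue
        match = LEFT_RE.search(line)
        # Attach the reserve count to the most recent event that lacks one.
        if match and events and events[-1]["left"] is None:
            events[-1]["left"] = int(match.group(1))
    return events


log = [
    "Oct 15 22:00:16 kernel: ubi0: mark PEB 1622 as bad",
    "Oct 15 22:00:16 kernel: ubi0: 4 PEBs left in the reserve",
    "Oct 15 22:24:15 ddns: IP address, server and hostname have not changed since the last update.",
    "Oct 15 23:00:17 kernel: ubi0: mark PEB 85 as bad",
    "Oct 15 23:00:17 kernel: ubi0: 3 PEBs left in the reserve",
]

for event in bad_peb_events(log):
    print(event["time"], "PEB", event["peb"], "reserve left:", event["left"])
```

Run against a longer log, the timestamps make it easy to tell whether failures cluster around a scheduled job such as the hourly statistics write.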

Odd that both happened on the hour, one hour apart. I had Traffic Analyzer > Statistic turned on, which logs stats hourly, so both errors landing on the hour makes me think that is what triggered them. I turned it off, but having that many bad PEBs makes me wonder whether my memory is bad and Traffic Analyzer just helped me notice it. I also moved Traffic Monitoring to store its data on an attached USB drive rather than in RAM.

I figure I’m hosed, so I ordered a BE98Pro and will use this one as a repeater. Even if I send my 16000 in for service, I still need a router in the meantime. It sucks, but I’m not sure I have other options.
 
Looks like the memory is dying indeed; however, I'm not sure that ubi_bgt0d crash is normal. From the same FAQ:
Q: What does the "ubi_bgt0d" thread do?
A: The UBI back-ground thread is a per-UBI device thread which has "ubi_bgtXd" name, where "X" is the UBI device number. For example, "ubi_bgt0d" is a background thread corresponding to UBI device 0.
The UBI background thread is doing background physical eraseblock erasure. This is an important optimization which greatly improves UBI I/O throughput (applications do not have to wait for erasure completion). For example, UBI unmap operation schedules physical eraseblocks for erasure.
The background thread also tortures faulty physical eraseblocks.
The UBI background thread also moves data from more worn-out physical eraseblocks to less worn out, i.e., performs wear-leveling. It also moves data from physical eraseblocks which have bit-flips. See the UBI overview section for some more information.
Note, UBI may work without the background thread, so the thread is just an optimization, although a very important one.
This is just my guess, but could it be that the crash of the ubi_bgt0d process is causing the blocks it was working on (maybe it was performing wear-leveling) to be falsely marked as bad? Honestly, I'd try resetting the router to factory settings and see how it behaves.
 
Does a factory reset reset all of the PEBs? If I backup my settings can I restore them or will that just restore the problem? I guess I don’t know what is backed up. Thanks!
 
Unfortunately, the bad PEBs will remain, but if there is a problem with the ubi_bgt0d process, theoretically, resetting the router can prevent new bad PEBs from appearing (but, again, it is just my guess). For best results, I would not restore the settings from your current setup.
 
I tried a factory reset and the router still shows the bad PEBs. I am working with support to see what my options are.
I turned off the Traffic Analyzer stats in the meantime. Hopefully the router doesn’t die before my replacement comes.
 
Depending on where you are and how much your local technician charges for the job, replacing flash memory can be very cheap and quick. In fact, you can do it yourself as long as you have a hot air rework station, solder, and desoldering braid, which are fairly cheap to buy. However, if you decide to go ahead you should practice using these tools before working on important chips.

Oh, depending on your chip, you may also need a reballing stencil...

------

I just had a look: you don't need reballing for your flash memory (MXIC MX30LF1G18AD 2Gbit). If you are good at it, you might even be able to get away without using a hot air rework station. It should be super easy.
 