Hard Disk wakes up very frequently

Re: Hard Disk wakes up very frequently

Postby Jocko » Sun Sep 20, 2020 8:33 am

Hi Mijzelf

I ran diskmon on my 5big2 last night (because I also have a spin-down issue on the system disk), and indeed I noticed these surprising lines in its output:
Code: Select all
==== Sun Sep 20 00:40:01 CEST 2020 ====
[24452.405340] diskmon(10430): dirtied inode 176 (ld-linux.so.3) on sda2
[24452.405399] diskmon(10430): dirtied inode 126 (ld-2.17.so) on sda2
[24452.406715] awk(10430): dirtied inode 143 (libm.so.6) on sda2
[24452.406775] awk(10430): dirtied inode 186 (libm-2.17.so) on sda2
[24452.407257] awk(10430): dirtied inode 177 (libc.so.6) on sda2
[24452.407315] awk(10430): dirtied inode 149 (libc-2.17.so) on sda2
[24452.441991] diskmon(10431): dirtied inode 63 (busybox) on sda2
[24452.443351] dmesg(10431): dirtied inode 10833 (libtirpc.so.1) on sda2
[24452.443413] dmesg(10431): dirtied inode 10257 (libtirpc.so.1.0.10) on sda2
[24452.444003] dmesg(10431): dirtied inode 146 (libnsl.so.1) on sda2
[24452.444061] dmesg(10431): dirtied inode 198 (libnsl-2.17.so) on sda2
[24452.444365] dmesg(10431): dirtied inode 180 (libpthread.so.0) on sda2
[24452.444413] dmesg(10431): dirtied inode 130 (libpthread-2.17.so) on sda2
...
[25653.174654] sleep(20297): dirtied inode 178 (librt.so.1) on sda2
[25653.174720] sleep(20297): dirtied inode 172 (librt-2.17.so) on sda2
[25654.278750] diskmon(20306): dirtied inode 70 (date) on sda2
...
[26252.913683] diskmon(25222): dirtied inode 11180 (awk) on sda2
[26252.915790] diskmon(25224): dirtied inode 11078 (gawk) on sda2
[26252.917478] awk(25224): dirtied inode 10282 (libgcc_s.so.1) on sda2
Later on, these lines no longer appear.

Thank you for your explanation about dirtied inodes :thumbup because until now I did not understand why we got them, and indeed the firmware partition is mounted with the relatime option :pound
Code: Select all
root@Acrab:/ # mount
/dev/sda2 on / type ext3 (rw,relatime)                                        <=============== here
none on /proc type proc (rw,relatime)
none on /sys type sysfs (rw,relatime)
none on /dev/pts type devpts (rw,relatime,mode=600,ptmxmode=000)
/dev/sda5 on /rw_fs type ext3 (rw,noatime)

Mijzelf wrote:Does diskmon run in busybox ash? AFAIK that uses 'internal' du, grep, ...
Not entirely: the firmware uses busybox applets but also standard commands:
Code: Select all
root@Acrab:/ # ls -al /usr/bin/du
-rwxr-xr-x  1 root root 100312 2012-10-01 20:30 /usr/bin/du
root@Acrab:/ # whereis grep
grep: /bin/grep /usr/bin/grep
root@Acrab:/ # ls -l /bin/grep /usr/bin/grep
lrwxrwxrwx  1 root root      7 2017-05-27 19:41 /bin/grep -> busybox
-rwxr-xr-x  1 root root 100420 2009-02-13 21:45 /usr/bin/grep
So we have both command types for grep, and only the standard du command.
=> So the script needs to be updated to use full paths.
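A minimal sketch of that fix, assuming diskmon is a /bin/sh script and the layout shown above (/bin/busybox as the applet multiplexer, /usr/bin/du as a standalone binary); the wrappers invoke busybox directly, so the shell never has to resolve the grep/awk symlinks on disk:

```shell
#!/bin/sh
# Hypothetical sketch: route the helpers through explicit paths so
# no symlink lookup via $PATH is needed at each call.
BB=/bin/busybox              # grep, awk, ... are symlinks to this binary
grep() { "$BB" grep "$@"; }
awk()  { "$BB" awk  "$@"; }
du()   { /usr/bin/du "$@"; } # du is a standalone binary on this firmware
# ...the rest of diskmon can keep calling grep/awk/du unchanged.
```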
Jocko
Site Admin - expert
 
Posts: 11367
Joined: Tue Apr 12, 2011 4:48 pm
Location: Orleans, France

Re: Hard Disk wakes up very frequently

Postby Mijzelf » Sun Sep 20, 2020 10:36 am

I did some research; apparently the busybox shell only uses internal applets (without searching the path) when 'FEATURE_PREFER_APPLETS' was set at compile time.
On the ZyXEL NASes for which I wrote this script it doesn't matter anyway, as the root filesystem is in RAM. But indeed, providing a full path (to the busybox applets) might prevent the disk from being woken up. (Unless the disk's cache was flushed, so it has to read the disk anyway to find out that the applets are symlinks to busybox.)
Jocko wrote:
Code: Select all
root@Acrab:/ # mount
/dev/sda2 on / type ext3 (rw,relatime)                                        <=============== here

In that case executing the binary shouldn't dirty the inode, I think?
man mount wrote:relatime
Update inode access times relative to modify or change time. Access time is only updated if the previous access time was earlier than the current modify or change time. (Similar to noatime, but doesn't break mutt or other applications that need to know if a file has been read since the last time it was modified.)

On my laptop the filesystems are mounted relatime. I searched the /bin directory for some obscure command which was never used, and looked at the timestamps:
Code: Select all
[user@Anthraciet ~]$ stat /bin/msgen
  File: /bin/msgen
  Size: 23912        Blocks: 48         IO Block: 4096   regular file
Device: 801h/2049d   Inode: 2753749     Links: 1
Access: (0755/-rwxr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2018-11-15 14:00:55.000000000 +0100
Modify: 2018-11-15 14:00:55.000000000 +0100
Change: 2019-12-01 21:02:37.201100375 +0100
 Birth: 2019-12-01 21:02:37.201100375 +0100
So on this file the access timestamp is older than the change timestamp. It should change on execution, but only while access < change. And so it did:
Code: Select all
[user@Anthraciet ~]$ /bin/msgen
/bin/msgen: no input file given
Try '/bin/msgen --help' for more information.
[user@Anthraciet ~]$ stat /bin/msgen
  File: /bin/msgen
  Size: 23912        Blocks: 48         IO Block: 4096   regular file
Device: 801h/2049d   Inode: 2753749     Links: 1
Access: (0755/-rwxr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2020-09-20 12:12:13.483275025 +0200
Modify: 2018-11-15 14:00:55.000000000 +0100
Change: 2019-12-01 21:02:37.201100375 +0100
  Birth: 2019-12-01 21:02:37.201100375 +0100
# Wait one minute...
[user@Anthraciet ~]$ /bin/msgen
/bin/msgen: no input file given
Try '/bin/msgen --help' for more information.
[user@Anthraciet ~]$ stat /bin/msgen
  File: /bin/msgen
  Size: 23912        Blocks: 48         IO Block: 4096   regular file
Device: 801h/2049d   Inode: 2753749     Links: 1
Access: (0755/-rwxr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2020-09-20 12:12:13.483275025 +0200
Modify: 2018-11-15 14:00:55.000000000 +0100
Change: 2019-12-01 21:02:37.201100375 +0100
 Birth: 2019-12-01 21:02:37.201100375 +0100

As you can see, the access time is changed only the first time, not the second time, just as relatime should behave. With noatime it wouldn't have changed at all.
So why are your inodes dirtied? Are the executables touched in between? Or are the symlinks created on each boot? In the latter case each first access after boot will dirty the inode.
Or maybe the 'change' time is bogus/in the future? I haven't tested it, but I can imagine that if 'change' or 'modify' is in the future, each access will update 'access'.
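A quick sketch to test that last guess, assuming a stat that supports -c (GNU coreutils or a recent busybox); it prints the raw epoch timestamps so a bogus or future 'change' time is easy to spot:

```shell
#!/bin/sh
# Compare the raw timestamps of a binary against the current time.
f=/bin/sh                 # substitute any suspect executable here
now=$(date +%s)
atime=$(stat -c %X "$f")  # access
mtime=$(stat -c %Y "$f")  # modify
ctime=$(stat -c %Z "$f")  # change
echo "now=$now atime=$atime mtime=$mtime ctime=$ctime"
if [ "$ctime" -gt "$now" ] || [ "$mtime" -gt "$now" ]; then
    echo "change/modify time is in the future: every access will dirty the inode"
elif [ "$atime" -lt "$ctime" ] || [ "$atime" -lt "$mtime" ]; then
    echo "access < change/modify: the next access will update atime (relatime)"
fi
```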
Mijzelf
 
Posts: 254
Joined: Wed Nov 21, 2012 9:12 am

Re: Hard Disk wakes up very frequently

Postby serram » Sun Sep 20, 2020 11:12 am

Here is the log1 file from almost 12 hours of logging with diskmon on my Lacie cloudbox, with the inactivity period set to 10 minutes and the LAN cable disconnected.
serram
Donator VIP
 
Posts: 29
Joined: Thu Aug 27, 2020 7:21 pm

Re: Hard Disk wakes up very frequently

Postby Jocko » Sun Sep 20, 2020 11:42 am

Hi

Thank you for your post.

After examining the log, here is what I can say:
- You will have to repeat the check with an updated diskmon version (one that uses only busybox applets) and the noatime option on the firmware system partition. (I just made a test and that fixes some I/O.)
- hostd updated the hosts file every 2 hours at xx:51, so at those times the disk was spun up.
- Mainly, the I/O accesses have a cyclic behaviour (30 min): xx:06, xx:21, xx:36, xx:51. I think noflushd still uses a 30 min timeout value. I found a bug last Friday: from the web interface, the firmware fails to kill the running daemon, and then noflushd does not use the new settings.
- Between these I/O cycles (JBD + noflushd) I noticed nothing.
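For the noatime part, here is a sketch of what I mean, assuming the firmware root is /dev/sda2 as in the mount output earlier (guarded so it only remounts when actually run as root with that root device; test before adopting):

```shell
#!/bin/sh
# Sketch: remount the firmware partition noatime so plain reads stop
# updating access times. Only acts as root when / really is /dev/sda2.
if [ "$(id -u)" -eq 0 ] && grep -q '^/dev/sda2 / ' /proc/mounts; then
    mount -o remount,noatime /
fi
grep ' / ' /proc/mounts   # inspect the current root mount options
```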

To confirm you have this bug, can you post the output of
Code: Select all
 ps -ef|grep noflushd
Note: do not reboot the NAS, because after a reboot noflushd would use the expected conf.

Re: Hard Disk wakes up very frequently

Postby Mijzelf » Sun Sep 20, 2020 11:57 am

I see a lot of unexplained activity, like the dirtying and flushing of inodes which shouldn't have been dirtied in the first place.

Another thing:
Code: Select all
==== Sun Sep 20 06:06:08 CEST 2020 ====
[18074.686786] jbd2/sda5-8(785): WRITE block 530784 on sda5 (8 sectors)
[18074.686822] jbd2/sda5-8(785): WRITE block 530792 on sda5 (8 sectors)
[18074.686840] jbd2/sda5-8(785): WRITE block 530800 on sda5 (8 sectors)
[18074.686854] jbd2/sda5-8(785): WRITE block 530808 on sda5 (8 sectors)
[18074.686867] jbd2/sda5-8(785): WRITE block 530816 on sda5 (8 sectors)
[18074.686880] jbd2/sda5-8(785): WRITE block 530824 on sda5 (8 sectors)
[18080.406232] jbd2/sda5-8(785): WRITE block 530832 on sda5 (8 sectors)
[18080.526928] noflushd(1315): WRITE block 0 on sda5 (8 sectors)
[18080.526971] noflushd(1315): WRITE block 8 on sda5 (8 sectors)
[18080.526994] noflushd(1315): WRITE block 786856 on sda5 (8 sectors)
[18080.527020] noflushd(1315): WRITE block 786864 on sda5 (8 sectors)
[18080.527040] noflushd(1315): WRITE block 790608 on sda5 (8 sectors)
This repeats until 7:51. The journal block numbers are incrementing, as is to be expected for a journal file. (I suppose it will wrap around when the end of the journal file is reached.)
And then each time noflushd writes blocks 0, 8, 786856, 786864 and 790608. I suppose 0 and 8 are header data, but it would be interesting to know which file(s) those high block numbers belong to.
If you have the tool debugfs, it can be found using
Code: Select all
debugfs -R "icheck 786856 786864 790608" /dev/sda5
Which will give the inode number(s), and
Code: Select all
debugfs -R "ncheck <inode-number>" /dev/sda5
which will give the (relative) filename.

It should also be possible to use debugfs to read the journal, but at the moment I don't know how.
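The two debugfs steps can also be chained into one helper; a sketch, assuming e2fsprogs' debugfs, root access, and the tabular 'Block / Inode number' output that icheck prints:

```shell
#!/bin/sh
# Map filesystem block numbers to filenames on /dev/sda5 (needs root).
DEV=/dev/sda5
blocks_to_files() {
    debugfs -R "icheck $*" "$DEV" 2>/dev/null |
        awk 'NR > 1 && $2 ~ /^[0-9]+$/ { print $2 }' |  # keep resolved inodes
        sort -u |
        while read -r ino; do
            debugfs -R "ncheck $ino" "$DEV" 2>/dev/null | awk 'NR > 1'
        done
}
blocks_to_files 786856 786864 790608
```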

Re: Hard Disk wakes up very frequently

Postby serram » Sun Sep 20, 2020 12:05 pm

Hi Jocko

Thank you for your conclusions.
I've observed that the disk only stays idle for a very few minutes, usually less than 10, so I think the hostd behaviour (every 2 hours) wouldn't be a big problem for me.
Still, the most desirable behaviour would be for the disk to stay idle as long as possible, to avoid future problems such as head failure.

Jocko wrote:To confirm you have this bug, can you post the output of
Code: Select all
 ps -ef|grep noflushd

This is the output of the ps and grep commands:
Code: Select all
root@fvdwsl-base:/ # ps -ef|grep noflushd
root      1315     1  0 01:05 ?        00:00:08 /usr/sbin/noflushd -n 10 /dev/sda
root     24926 24906  0 13:51 pts/0    00:00:00 grep noflushd


I'll repeat the test with the updated diskmon version when you tell me; thanks again.
Last edited by serram on Sun Sep 20, 2020 12:19 pm, edited 1 time in total.

Re: Hard Disk wakes up very frequently

Postby serram » Sun Sep 20, 2020 12:12 pm

Hi Mijzelf, this is the output of the debugfs commands:
Code: Select all
root@fvdwsl-base:/ # debugfs -R "icheck 786856 786864 790608" /dev/sda5
debugfs 1.43.4 (31-Jan-2017)
Block   Inode number
786856  <block not found>
786864  <block not found>
790608  <block not found>

root@fvdwsl-base:/ # debugfs -R "ncheck <inode-number>" /dev/sda5
debugfs 1.43.4 (31-Jan-2017)
<inode-number>: Bad inode - <inode-number>

Re: Hard Disk wakes up very frequently

Postby Jocko » Sun Sep 20, 2020 12:50 pm

From the web, it seems you have to divide the block number by 8 (the fs block size is 4096 = 8 x 512):
So run
Code: Select all
debugfs -R "icheck 98357 98358 98826" /dev/sda5

Re: Hard Disk wakes up very frequently

Postby serram » Sun Sep 20, 2020 1:08 pm

Here is the output:
Code: Select all
root@fvdwsl-base:/ # debugfs -R "icheck 98357 98358 98826" /dev/sda5
debugfs 1.43.4 (31-Jan-2017)
Block   Inode number
98357   <block not found>
98358   <block not found>
98826   22428

Re: Hard Disk wakes up very frequently

Postby Jocko » Sun Sep 20, 2020 1:13 pm

and then run
Code: Select all
debugfs -R "ncheck 22428" /dev/sda5
