Bug 9177 - init.d mountall.sh LVM fix [PATCH PROVIDED]
Status: NEW
Product: Codex
Classification: Unclassified
Component: smgl
Version: stable grimoire
Hardware: Other other
Importance: P2 normal
Assigned To: Grimoire Bug List
Depends on:
Blocks: 8198
Reported: 2005-07-06 06:16 UTC by Eric Schabell
Modified: 2008-06-16 12:23 UTC
CC List: 4 users

See Also:


Attachments
patch for mountall.sh (lvm fix) (443 bytes, patch)
2005-07-06 06:17 UTC, Eric Schabell
updated patch for lvm in mountall.sh (1.65 KB, patch)
2005-11-05 02:18 UTC, Jeremy Blosser
initial devel commit of init.d 2.2.0 (27.73 KB, application/octet-stream)
2006-02-27 02:09 UTC, Jeremy Blosser

Description Eric Schabell 2005-07-06 06:16:24 UTC
I have sorted out what is needed to boot and mount LVM partitions correctly in
the mountall.sh script; see the attached patch.
 
Making this a sub-bug of #8198.
Comment 1 Eric Schabell 2005-07-06 06:17:48 UTC
Created attachment 4369 [details]
patch for mountall.sh (lvm fix)
Comment 2 Andrew Stitt 2005-07-06 07:47:19 UTC
Awesome, thanks! Handing this to Seth as he tends to take care of the init system.
Comment 3 Seth Woolley 2005-07-06 13:19:10 UTC
how safe is --ignorelockingfailure?

Other scripts we have are overzealous about protecting the system; for
example, we make sure fsck passes before mounting partitions rw.  The
locking, I figure, is similar to fsck's dirty-partition check, so maybe we
want a kernel command-line option instead, like forcemountonerror=yes, that
we can use to add the --ignorelockingfailure switch and to bypass the fsck
process in general.
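
A flag like that could be read from /proc/cmdline early in the script. A
minimal sketch, assuming /proc is already mounted; the option name
forcemountonerror and the LVM_OPTS variable are hypothetical:

  # Only relax locking (and, elsewhere, fsck) if the user asked for it
  # on the kernel command line with forcemountonerror=yes.
  LVM_OPTS=""
  if grep -q 'forcemountonerror=yes' /proc/cmdline 2>/dev/null
  then
    LVM_OPTS="--ignorelockingfailure"
  fi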
Comment 4 Eric Schabell 2005-07-07 00:19:37 UTC
All I remember about it is that if you don't use it, the script fails with a
non-zero exit code (not sure which one, though). It was in the semi-howto I
followed to get this to work.

Without it you just get a warning message every time, but due to the
accompanying exit code, the script fails. I get an unbootable system, as /var
and /home are LVM volumes here on my production box.

PS. This means I will not be testing any solutions you provide outside of the
one I have here that works. I don't reboot this box unless absolutely
necessary! ;-)
 
Comment 5 Seth Woolley 2005-07-07 09:45:44 UTC
man 8 lvm common option excerpt:

       --ignorelockingfailure
              This lets you proceed with read-only metadata  operations  such
              as  lvchange  -ay  and  vgchange -ay even if the locking module
              fails.  One use for this is in a system init script if the lock
              directory is mounted read-only when the script runs.

I take it our /var/lock is read-only when the script runs.  Seems 
perfectly sensible to me.  I'll get this in soon.
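
In init-script terms, the usage the man page describes is just the
activation pair with the extra switch, as in the attached patch:

  # Activate all volume groups even if /var/lock is still read-only.
  /sbin/vgscan   --ignorelockingfailure --mknodes &&
  /sbin/vgchange -ay --ignorelockingfailure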
Comment 6 Seth Woolley 2005-07-08 12:31:26 UTC
testing a patch in devel grimoire
Comment 7 Jeremy Blosser 2005-11-05 02:18:04 UTC
Created attachment 4432 [details]
updated patch for lvm in mountall.sh
Comment 8 Jeremy Blosser 2005-11-05 02:18:47 UTC
The current mountall.sh, based on the previous patch, doesn't actually work
with LVM.  The attached patch to the mountall.sh in test does the trick and is
what the new sourcemage box is using.  I'll apply this in a few days if no one
complains here.
Comment 9 Eric Schabell 2005-11-05 09:36:27 UTC
I don't understand your comment that the previous fix does not work with
LVM??? I use it and it does work. Further, your patch only makes changes
outside of the actual LVM part of mountall.sh:
 
<snip>
if optional_executable /sbin/vgscan && optional_executable /sbin/vgchange ;
then
+    echo "Mounting proc file system..."
+    mount /proc
+    evaluate_retval || exit 1
     echo -n "Scanning for and initializing all available LVM volume groups..."
     /sbin/vgscan       --ignorelockingfailure  --mknodes  &&
     /sbin/vgchange -ay --ignorelockingfailure
<snip>
 
So I have no problems with it as long as you do not bork the LVM stuff! ;-)

P.S. Thanks for checking with the users of this part of mountall.sh, though; I
run production stuff on this box, so it does not reboot very often.
Comment 10 Jeremy Blosser 2005-11-05 12:01:29 UTC
The problem is the order of operations, not the LVM section itself.  The current
order used is:
1. mount root read-only
2. activate lvm
3. check all file systems
4. remount root rw
5. activate swap
6. mount -a

This fails here for a couple of reasons:
a. lvm needs to write to /etc/lvm/archive, and can't when root is ro
b. lvm wants /proc mounted first to check /proc/partitions

This patch changes the order to:
1. mount root read-only
2. check root
3. remount root rw
4. mount /proc
5. activate lvm
6. check all non-root fs
7. mount swap
8. mount -a

I'm not sure how yours would be working with what's there unless there are
ways to make lvm not need a writeable /etc or /proc (quite possible; I'm an
lvm newbie), but in any case these changes should be harmless.
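
Expressed as pseudo-shell, the reordering amounts to something like the
following sketch (not the literal patch; FSCK_OPTS and the exact fsck
invocations are placeholders):

  mount -n -o remount,ro /                    # 1. root read-only
  fsck $FSCK_OPTS /                           # 2. check root
  mount -n -o remount,rw /                    # 3. root rw, so lvm can
                                              #    write /etc/lvm/archive
  mount /proc                                 # 4. lvm reads /proc/partitions
  /sbin/vgscan   --ignorelockingfailure --mknodes &&
  /sbin/vgchange -ay --ignorelockingfailure   # 5. activate lvm
  fsck $FSCK_OPTS -A -R                       # 6. check all non-root fs
  swapon -a                                   # 7. mount swap
  mount -a                                    # 8. mount everything else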
Comment 11 Jeremy Blosser 2005-11-05 12:41:51 UTC
Ok, after a bit more digging, there is one place this could fail: if the
system has no procfs support.  It looks like it might be possible to build LVM
without needing /proc, though I haven't exactly found how yet.  We could
probably just do mount -a -t proc instead of mount /proc to fix this, or at
least avoid an error regardless of how they decided to use procfs.
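
The difference being that mount -a only acts on matching /etc/fstab entries,
so it degrades quietly; a sketch:

  # Mounts every fstab entry of type proc; silently does nothing if the
  # system has no such entry, whereas "mount /proc" would error out.
  mount -a -t proc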
Comment 12 Eric Schabell 2005-11-08 03:03:46 UTC
Well, I have procfs and it works; I hope your fix is sufficiently tested
before hitting stable. I would really hate to have to go through rescues again
if this borks on my production box...
Comment 13 Eric Schabell 2006-01-10 10:09:41 UTC
This stuff hit my production box months ago... might wanna mark it done?
Over 7 months old, nice to get it outta my list! ;-)
Comment 14 Eric Schabell 2006-02-20 14:34:57 UTC
Ok, I will mark it fixed for me! 
Comment 15 Eric Schabell 2006-02-20 14:35:18 UTC
And verified by me! 
Comment 16 Jeremy Blosser 2006-02-20 14:57:01 UTC
It is not fixed, as my previous comments noted.  I haven't yet gone back to it
because there were some corner cases to work out and I wasn't going to apply
something half-done to something this critical.  LVM is still a primary concern
for the 0.9.7 ISO release series coming later this year, so it will get
addressed at that time if not before.
Comment 17 Jeremy Blosser 2006-02-27 01:50:25 UTC
The final fix for this is in devel at 75279; anyone interested who has a test
system, PLEASE try it out and give feedback so we know nothing was missed when
we move it to test.
Comment 18 Jeremy Blosser 2006-02-27 01:52:06 UTC
Marking FIXED to note the fix is in the tree; it shouldn't get VERIFIED until
the fix is in at least test.
Comment 19 Eric Schabell 2006-02-27 02:00:22 UTC
Could you mail me the devel version? I would like to see this before it
migrates to my boxes.
Comment 20 Jeremy Blosser 2006-02-27 02:09:57 UTC
Created attachment 4657 [details]
initial devel commit of init.d 2.2.0

I'll just attach it here in case others have the same request.  Don't try to
install it anywhere without the module-init-tools tarball attached to bug
10146.
Comment 21 Jeremy Blosser 2006-03-11 23:25:29 UTC
This has been integrated to test.
Comment 22 Jeremy Blosser 2006-04-19 14:42:51 UTC
Sounds like these changes don't necessarily work on initrd / lvm root systems. 
Looking into it...
Comment 23 Jeremy Blosser 2006-04-19 14:44:40 UTC
/proc mounting can go to mountroot.sh before the root fs is mounted ro.

The /etc rw error is that if /etc/lvm doesn't exist, and /etc isn't writeable
when vgscan runs, you get:

"Failed to create LVM2 system dir for metadata backups, config files and
internal cache.  Set environment variable LVM_SYSTEM_DIR to alternative location
or empty string."

I'm looking into the way around this.
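
The message points at one possible (if blunt) escape hatch; a sketch, with
the caveat that an empty LVM_SYSTEM_DIR disables the metadata backups LVM
would otherwise keep:

  # Stop vgscan from trying to create /etc/lvm while /etc is read-only.
  LVM_SYSTEM_DIR="" /sbin/vgscan --ignorelockingfailure --mknodes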
Comment 24 David Kowis 2006-04-19 14:52:30 UTC
(In reply to comment #23)
> 
> I'm looking into the way around this.

Random useless thought: mount a tmpfs and then synchronize it back up and
destroy the tmpfs in the normal mount?
Probably not a useful solution...

Comment 25 Jeremy Blosser 2006-04-19 18:56:30 UTC
It looks like we can turn off the need for /etc on a per-command basis, but I
don't want to do that since it's the default configuration, and the docs
strongly recommend against turning it off.

However, even without that there are issues for things like static /dev.
Basically we need to do some rather different ordering depending on what the
user has set up:

normal routine:

devices  : (udev|static)
mountroot: (proc)
           (raid)
           root ro
           fsck root
           root rw
modutils : (modules)
mountall : fsck
           mountall

lvm:
- needs /proc
- needs writeable /dev
- needs writeable /etc (in default config)
- may need modules

if using root lvm w/initrd you need:
- udev or /dev on initrd
- /etc on initrd
- no modules or modules on initrd

So basically if they're using lvm for / we can assume they have most of these
things on the initrd and just scan and activate the volumes before we try to
access root.  If they *aren't* using an initrd we should wait to do anything
with lvm until root is rw and modutils has run.

Given that running the lvm commands "early" doesn't actually break anything, I'm
considering spinning lvm off to its own init script and making both mountroot
and mountall try to run it (via WANTS), with some sanity checks to avoid running
it needlessly or twice.  I'm not sure if this is the best approach but we
definitely have some cases where we want it run from mountroot and some where we
want it run from mountall.

Anyone have comments?
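
A minimal sketch of that idea, with hypothetical names (lvm_start,
LVM_STARTED); both mountroot and mountall would WANT the script, and the
guard keeps the activation from running twice:

  # init.d/lvm (sketch)
  lvm_start() {
    [ -n "$LVM_STARTED" ] && return 0   # already ran earlier in boot
    /sbin/vgscan   --ignorelockingfailure --mknodes &&
    /sbin/vgchange -ay --ignorelockingfailure &&
    LVM_STARTED=yes
  }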
Comment 26 Sergey Lipnevich 2006-04-19 21:50:42 UTC
As I said on the list, I never observed the LVM commands that I moved touch
/etc. I have only used (successfully) LVM 2. With mountroot.sh the way it is
at this moment (with the LVM calls), my (initrd-booted) system starts without
the root FS mounted. You can take a look at the relevant scripts in the
linux-initrd spell, profile mk_initrd.lvm.
Comment 27 Jeremy Blosser 2006-04-19 23:19:28 UTC
The /etc behaviour is documented in the lvm.conf man page, see the 'archive'
section and the 'archive_dir' and 'backup_dir' options.  I'm assuming they
default to writing this to /etc because it'll be on the root fs and available at
the LVM init stage more often than /var would be.

I think perhaps it only really errors if it can't create the dir at all, and
not if the dir is there but currently unwriteable, but I haven't been able to
completely confirm that yet since my test boxes have static dev as well.
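
For reference, the knobs in question live in the backup section of lvm.conf;
an illustrative excerpt with the default paths, not a suggested change:

  backup {
      # Metadata backups/archives written by vgscan, vgchange, etc.
      # Both default to the root fs so they are reachable early in boot.
      backup      = 1
      backup_dir  = "/etc/lvm/backup"
      archive     = 1
      archive_dir = "/etc/lvm/archive"
  }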
Comment 28 Seth Woolley 2006-07-08 17:29:32 UTC
This bug existed in the stable-0.3 release
Comment 29 Jeremy Blosser 2006-07-11 17:09:05 UTC
The issues are now in stable-rc and a primary part of the 0.9.7 iso target, so
they're must-fix.  Note again: the current version of init.d in test works for
people with lvm, but not lvm initrds.  The version in devel works for people
with lvm initrds, but not for people with lvm without initrds.  The stated
target for 0.9.7 is installer support for lvm not including initrds, but since
the existing stable works with an initrd we shouldn't regress the regular
grimoire support.
Comment 30 David Brown 2007-01-07 17:54:18 UTC
Okay, I just set up root lvm, and the only thing I had to add to mountroot.sh was this:

=================
--- mountroot.sh.old    2007-01-07 15:50:08.000000000 -0800
+++ mountroot.sh        2007-01-07 15:49:54.000000000 -0800
@@ -51,6 +51,12 @@
     raidstart  --all
   fi
 
+  if optional_executable /sbin/dmsetup
+  then
+    echo "Making device mapper nodes..."
+    /sbin/dmsetup mknodes
+  fi
+
   echo "Mounting root file system read only..."
   mount   -n  -o  remount,ro  /
   evaluate_retval || exit 1
==================

As long as the device entries for root in /etc/fstab are in the /dev/mapper
dir, this will create the nodes needed.  If you want the
/dev/<volgroup>/<volume> syntax, we'll have to add the other two commands
needed.

I wouldn't expect running these commands twice to create ill effects.
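
Presumably the "other two commands" are the scan/activate pair used elsewhere
in this bug; a sketch of the fuller hunk under that assumption:

  if optional_executable /sbin/dmsetup
  then
    echo "Making device mapper nodes..."
    /sbin/vgscan   --ignorelockingfailure --mknodes
    /sbin/vgchange -ay --ignorelockingfailure
    /sbin/dmsetup mknodes    # populates /dev/mapper
  fi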
Comment 31 Jeremy Blosser 2007-01-18 17:16:10 UTC
I think that starts to work, but it needs a few more things:

- It works for udev because you get /proc from init.d/*/devices udev_start(). 
It won't work for static because it won't have /proc.  We should have /proc get
mounted in devices start() so everything has it (and it can come out of
mountall.sh in that case).

- We should add the vgscan and vgchange calls in there, yeah.

- On my local test setups I only try to run the lvm commands in mountroot if I
see /dev is writeable (if [ -w /dev ]), and then have all of them in a
startlvm() function that also sets an LVM_DONE variable.  This means it only
runs once; I don't know if that's really necessary or not.

This would still leave us potentially running these commands both in mountroot
and in mountall, since non-initrd setups would possibly need modutils before
being able to run these commands.  This isn't ideal but probably gets us past
"must fix" for the release.
Comment 32 Jeremy Blosser 2007-01-18 22:55:02 UTC
There's a fix based on this in test now, git commit
d799c9f7d0863829c5fb7c701be785216314090c.

Sergey and erics, if you're still reading and want to try this, I'd appreciate
it.  This last bug we've been trying to deal with is one that would affect your
installs so I'd like absolute confirmation it works for you this way.
Comment 33 Eric Schabell 2007-01-19 06:48:53 UTC
jeremy,

Thanks for asking, I will be away from the box in question the coming week, will
try to take a look next week and get back to you.

erics
Comment 34 Jeremy Blosser 2007-01-19 22:19:24 UTC
It will probably be in stable by then, but we only know of three systems in the
wild that have LVM roots, and one of those reports success so far.  Just be
cautious upgrading and let us know of any issues.
Comment 35 Jeremy Blosser 2007-01-20 12:19:18 UTC
Do not mark this fixed after integrating, there's more to do on the bug.  But
this much will let us clear 'must fix'.
Comment 36 Jeremy Blosser 2007-01-21 02:20:50 UTC
I've integrated this fix and a few it depended on to stable-rc-0.6.
Comment 37 Eric Schabell 2007-02-19 04:14:15 UTC
Jeremy,

> Sergey and erics, if you're still reading and want to try this, I'd appreciate
> it.  This last bug we've been trying to deal with is one that would affect your
> installs so I'd like absolute confirmation it works for you this way.
>
I have upgraded with the latest stable 0.6 and 0.7, but a reboot of the
machine is not possible for a while anyway. You will hear from me either way
when it happens.
Comment 38 Jeremy Blosser 2007-04-01 00:10:34 UTC
reassign to sm-grimoire-bugs
Comment 39 Eric Schabell 2007-05-08 03:44:22 UTC
For me this is fixed... you can close it and mark it verified. Thanks.
Comment 40 Eric Sandall 2007-06-14 14:49:14 UTC
What's left on this bug now that the original bug reported by erics is verified
as fixed?
Comment 41 Arwed v. Merkatz 2008-06-16 12:23:34 UTC
This now applies to stable-rc (0.22).