Ubuntu 9.04 gives black screen after splash, remote login shows hung/unkillable gdm instance

Asked by FactTech

This is an issue that was happening under a fresh Ubuntu 8.10 install, and it has continued to happen since a recent upgrade to 9.04. I'm opening this bug to help diagnose this issue and possibly develop both a workaround and a fix.

The machine in question is an eMachines T2824, and I'm using the built-in graphics system that's part of the motherboard. 'lspci' reports it is an "Intel Corporation 82845G/GL[Brookdale-G]/GE Chipset Integrated Graphics Device (rev 01)", with subsystem "Intel Corporation Device 5352".

The problem that I am seeing is that, on a fair percentage of boots (around 40-50%), the machine appears to hang with a black screen immediately after the splash screen progress bar fills up. This happens before the background color is changed, the mouse cursor appears, or gdm comes up. The machine appears completely unresponsive at this point, and I cannot access alternate terminals via CTRL-ALT-# as might be expected.

In order to help diagnose the problem, I have set up ssh-server so that I can remotely log into the machine while it is "hung". Remote login works perfectly normally when the screen is black, and inspecting the output of 'ps aux' shows two instances of /usr/sbin/gdm being executed.

Attempts to restart gdm with 'sudo /etc/init.d/gdm restart' do not work. At least one instance of gdm seems to be unresponsive to commands:
 - 'sudo /etc/init.d/gdm stop'
 - 'sudo kill <PID of instance>'
 - 'sudo kill -9 <PID of instance>'

In addition, remotely executing 'sudo shutdown -r now' fails to shut down the machine, and it must be powered off completely to try to restart.
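A process that ignores even 'kill -9' is often in uninterruptible sleep (state 'D' in ps output), typically blocked inside a kernel or driver call. Whether that is what happened here is not confirmed anywhere in this report. A minimal sketch for checking it over ssh follows; the helper name describe_stat is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: translate the first letter of a ps STAT code
# into a short description.
describe_stat() {
    case "$1" in
        D*) echo "uninterruptible sleep (signals are normally deferred)" ;;
        R*) echo "running or runnable" ;;
        S*) echo "interruptible sleep" ;;
        T*) echo "stopped" ;;
        Z*) echo "zombie (already dead, waiting to be reaped)" ;;
        *)  echo "other" ;;
    esac
}

# List every process whose command name is gdm, with its state and the
# kernel function it is waiting in (wchan). Separate -o options are used
# so that the empty headers are parsed the same way everywhere.
ps -C gdm -o pid= -o stat= -o wchan= -o args= 2>/dev/null |
while read -r pid stat wchan args; do
    echo "$pid [$stat: $(describe_stat "$stat")] waiting in: $wchan ($args)"
done
```

If a gdm instance shows state D, the wchan column may hint at which kernel or driver routine it is stuck in, which could be useful information to attach to a report like this one.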

Some items to note:

1) The hang seemed to be avoided every time I accessed the GRUB menu and overrode the 'quiet' and 'splash' options from the default boot menu entry, then booted. However, making the change permanent by adjusting /boot/grub/menu.lst has still resulted in at least one hang on a cold start.

2) The hang also seemed to be avoided every time I accessed an alternate terminal during the splash screen. In these cases, the display switches to gdm when it is initialized, as would be expected.

3) I'm not positive that gdm is the ultimate culprit here, but I am filing this bug against gdm because that is the program that appears unresponsive.

4) This issue is not limited to cold boots. I have seen the same thing happen on warm restarts as well, and it can be reliably reproduced by executing 'sudo /etc/init.d/gdm restart' a few times in succession.
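Regarding note 1: on Ubuntu releases that use GRUB legacy, update-grub regenerates the kernel lines in /boot/grub/menu.lst from the commented '# defoptions=' line, so an edit made only to the kernel lines can be lost after a kernel update. A sketch of removing 'quiet' and 'splash' from a line of options follows. strip_boot_opts is a hypothetical helper, and the sample kernel line is illustrative, not taken from the machine described here.

```shell
#!/bin/sh
# Hypothetical helper: remove the standalone words "quiet" and "splash"
# from each line on stdin, leaving all other options untouched.
strip_boot_opts() {
    awk '{
        out = ""
        for (i = 1; i <= NF; i++) {
            if ($i == "quiet" || $i == "splash") continue
            out = (out == "" ? $i : out " " $i)
        }
        print out
    }'
}

# Illustrative input only; a real line comes from /boot/grub/menu.lst.
echo "kernel /boot/vmlinuz root=/dev/sda1 ro quiet splash" | strip_boot_opts
```

After changing the '# defoptions=' line by hand (it must stay commented), running 'sudo update-grub' should rewrite the kernel entries to match.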

I'm not an expert, but I can certainly follow directions to gather more information about this problem. Please let me know what additional information you would like to help diagnose this issue.

gdm version = 2.20.10-0ubuntu2
lsb_release -rd = Ubuntu 9.04

Question information

Language: English
Status: Solved
For: Ubuntu gdm
Assignee: No assignee
Solved by: FactTech

Sebastien Bacher (seb128) said :
#1

Thank you for your bug report. What video card and driver do you use? Do you get the issue using another login software?

FactTech (launchpad-facttechnologies) said :
#2

As described, this is the default graphics device included on the motherboard of an eMachines T2824. According to the specifications at http://e4me.com/support/product_support.html?cat=Desktops&subcat=T%20Series&model=T2824 , this is "Intel® Extreme Graphics 3D / 64MB Shared memory".

The driver in use appears to be 'i915', based on the output from lsmod, so...

Graphics card (per manufacturer): Intel® Extreme Graphics 3D / 64MB Shared memory
Graphics card (per lspci): Intel Corporation 82845G/GL[Brookdale-G]/GE Chipset Integrated Graphics Device (rev 01)
Driver in use: i915
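The driver check described above can be sketched as follows. module_loaded is a hypothetical helper that reads lsmod-style output on stdin; the commented lspci command at the end assumes a pciutils version that supports the -k option.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the named module appears in the first
# column of lsmod-style output read from stdin (header line skipped).
module_loaded() {
    awk -v mod="$1" 'NR > 1 && $1 == mod { found = 1 } END { exit (found ? 0 : 1) }'
}

# Report on the i915 module if lsmod is available on this machine.
if command -v lsmod >/dev/null 2>&1; then
    if lsmod | module_loaded i915; then
        echo "i915 is loaded"
    else
        echo "i915 is not loaded"
    fi
fi

# Show the graphics device and, where supported, its bound kernel driver:
#   lspci -k | grep -A 3 -i vga
```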

I have not tried any other login packages. Can you point me to some instructions on installing and testing another package so I can give it a try?

Sebastien Bacher (seb128) said :
#3

Changing the bug to a question; it's rather that than a clear gdm bug.

FactTech (launchpad-facttechnologies) said :
#4

Although Sebastien has reclassified this as a question instead of a bug report, he did not provide any additional information on how I might resolve it and/or confirm that it is a gdm bug.

I'm certainly interested in getting to the bottom of this issue, so any advice anyone has to offer in diagnosing it is welcome.

aswanson (bigfoot-08) said :
#5

I can't say that I have any advice, but I'm in the same boat you are. I'm having the same problem with a Pavilion a305w, which also has integrated Intel Extreme Graphics with 64MB shared memory. My chipset is 845GL. I posted a thread about it on UbuntuForums.org, so maybe we might be able to get some answers. Keep me posted if you find out anything or work out a solution.

http://ubuntuforums.org/showthread.php?t=1149879

FactTech (launchpad-facttechnologies) said :
#6

Aswanson, thanks for the note. I've been looking for more information, but all I've really discovered is that the "intel" driver (which is used for Intel on-board video chipsets) is currently being heavily restructured, and there are a lot of known problems.

My issue seems to be specifically related to mode/resolution switching -- it will run for a long time so long as I don't switch to a CTRL-ALT terminal or use any programs that change the resolution of the display. Does yours exhibit the same behavior?

Michael Mevers (mmevers) said :
#7

Following the original post, I remoted in and found the exact same behavior using an integrated ATI chipset. I had been using Ubuntu 8.10 with the open source driver set, and that was stable. The problem started only after I did a clean install of 9.04. The odd thing is that it began only after several additional packages and updates were installed and the system was rebooted, but the end result matches the description above. Has anyone been able to resolve this or work around it? I am going back to 8.10 since it was stable, but I would like to use 9.04 since it resolves some SATA and memory issues on my Asus MB.

raywood (ray-woodcock) said :
#8

No answer yet, but a parallel discussion at http://ubuntuforums.org/showthread.php?p=7248465#post7248465

aswanson (bigfoot-08) said :
#9

I haven't tried switching terminals while everything seems to be working; I'll have to check on that. For me, though, it doesn't seem to be connected to changing the resolution at all; it seems random. I can reboot fine one time and then, without running anything, just shut down and turn it on again, and all I get is a black screen. I'll let you know if anything happens with changing terminals.

raywood (ray-woodcock) said :
#10

We may have differing versions of the same problem; not sure.

This morning, I disconnected the nvidia graphics card and plugged the monitor into the onboard graphics connector. Made no difference.

It occurred to me that I should remove the nvidia drivers. (See http://randomtechoutburst.blogspot.com/2009/03/hand-cranked-nvidia-drivers-on-kubuntu.html) I rebooted into recovery mode, went to the root shell prompt, and typed "apt-get remove linux-restricted-modules-common linux-restricted-modules-generic". (Note that this is a list of two separate things to remove, separated by a space.) This command actually resulted in four packages being removed. The removal process gave me "Segmentation fault," which did not sound good. But when I ran the same command again, apt-get indicated there were no additional packages to remove. Apt-get also suggested running "apt-get autoremove", so I did that too. I then typed "exit" and, back at the recovery menu, I ran dpkg and fsck. Then I tried resume, to resume a normal boot. This did not work. I punched the reset button and tried a normal boot. No luck there either.

Back in recovery mode, at the prompt, I typed "dkms status" (see https://bugs.launchpad.net/ubuntu/+source/nvidia-graphics-drivers-180/+bug/282214). This seemed to say that the nvidia driver was still installed. Another post in that thread suggested "apt-get remove --purge nvidia*" (no need for "sudo" at the root prompt). This removed nine packages. There were a couple of error messages about directories not being removed because they weren't empty, so I ran it again. No additional impact. I tried "dkms status" again. This time, no message. I exited and tried a recovery boot.
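Before and after a purge like the one above, it can help to list which nvidia packages dpkg still considers installed. A sketch follows; installed_matching is a hypothetical helper that reads 'dpkg -l' style output on stdin. Quoting the pattern when passing it to apt-get keeps the shell from expanding it against files in the current directory.

```shell
#!/bin/sh
# Hypothetical helper: print names of packages whose dpkg status is "ii"
# (installed) and whose name starts with the given prefix.
installed_matching() {
    awk -v prefix="$1" '$1 == "ii" && index($2, prefix) == 1 { print $2 }'
}

# On a Debian/Ubuntu system, show what is still installed.
if command -v dpkg >/dev/null 2>&1; then
    dpkg -l | installed_matching nvidia
fi

# The purge from the post, with the pattern quoted (run as root):
#   apt-get remove --purge 'nvidia*'
```

Packages in state "rc" (removed, configuration files remaining) are deliberately left out of the listing.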

Success! This gave me back my Ubuntu desktop. I shut down the machine, reinstalled the nvidia graphics card, reconnected the monitor to it. Weird experience there: I hadn't shut off the main power switch on the power supply, and when I plugged in the nvidia card, the machine started itself up! Hadn't had that happen before. I had previously noticed that the white print on black background was not as bright through the nvidia card as it had been through system graphics, and that was still true.

Normal boot worked fine. Back in Ubuntu, System > Administration > Hardware Drivers did not produce a list of available nVidia drivers. It appeared that I had *really* wiped them from the system. Some people were saying that they had the same empty list in Hardware Drivers, but their high-quality desktop effects were working OK, so apparently the drivers were installed and just not listed. (See http://ubuntuforums.org/showthread.php?t=775304.) But when I went to System > Preferences > CompizConfig Settings Manager, everything was blank there too. In System > Administration > Synaptic, I saw that simple-ccsm was installed, so I went to System > Preferences > Appearance > Visual Effects > Custom, but this gave me "Desktop effects could not be enabled." Synaptic also said that compizconfig-settings-manager was installed. I marked it for reinstallation and applied, but then I still got "Desktop effects could not be enabled" in Visual Effects > Custom. I also got the same thing when I tried "Extra" instead of "Custom" in Visual Effects.

In Synaptic, there didn't seem to be any nvidia drivers installed. I marked all that contained a reference to "180" (that being the driver version I had been using successfully before) and installed those. But that wasn't it; System > Administration > Hardware Drivers still showed no drivers, and I still got "Desktop effects could not be enabled." When I typed "SKIP_CHECKS=yes compiz" into Terminal, it seemed I had several problems that I was not sure how to resolve: "No whitelisted driver found" and "Software rasterizer detected" and "can not add gnomecompat" and nVidia "not present" and Xgl "not present." Apparently the "skip checks" command screwed up Terminal, because it didn't return me to a prompt; I had to kill it.

In the mother of all threads on this issue, someone said, "The software rasterizer is just a general hint, that the graphics chip is not properly installed and/or set up." See http://www.uluga.ubuntuforums.org/showthread.php?5=799070&page=61. I also saw some posts indicating that the problems affecting at least some users would not be fixed until the next version of Ubuntu, in October 2009.
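One way to check for the software rasterizer mentioned in that quote is to look at the OpenGL renderer string reported by glxinfo (from the mesa-utils package). This is a sketch, assuming the renderer string contains "Software Rasterizer", "llvmpipe" or "softpipe" when Mesa falls back to software rendering; uses_software_gl is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: succeed if glxinfo-style output on stdin reports a
# software renderer in its "OpenGL renderer string" line.
uses_software_gl() {
    grep -i 'OpenGL renderer string' | grep -Eqi 'software rasterizer|llvmpipe|softpipe'
}

# Only meaningful inside a running X session with glxinfo installed.
if command -v glxinfo >/dev/null 2>&1 && [ -n "$DISPLAY" ]; then
    if glxinfo 2>/dev/null | uses_software_gl; then
        echo "OpenGL is using a software renderer"
    else
        echo "OpenGL appears to be hardware accelerated (or glxinfo failed)"
    fi
fi
```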

It appeared, here, that the appropriate steps might differ depending on whether you were running i386 or (as in my case) amd64. (See http://ubuntuforums.org/showthread.php?t=1130582. The steps advised in that thread were fairly complex. Since it was not directly addressing my precise situation, I did not try them.)

In a new Terminal session, I tried "gconftool-2 --set --type=bool /apps/metacity/general/compositing_manager false" because someone said that helped them, but now the system was more or less unresponsive. I restarted and tried that command again. This time, it seemed to run, at least in the sense that it took me back to a prompt without any message. I then tried "compiz --replace" but this noted that xgl was still not present and "No whitelisted driver found" and therefore "aborting and using fallback: /usr/bin/metacity." This was all very reminiscent of the problem in http://ubuntuforums.org/showthread.php?t=609295, but there they concluded they didn't have the right kind of video card, whereas mine had been working fine.

Another possibility was to add another repository for drivers. See http://webupd8.blogspot.com/2009/05/graphic-video-drivers-ubuntu-repository.html. But now I found that other aspects of my system weren't working. For one thing, the top bar was not appearing on various windows, so I could not grab and move them. I also could not select different programs from the taskbar. So having started to follow those instructions, I found that I could not get back to the Software Sources window to continue. I restarted the system. This time, I was able to add the suggested repositories. I also made sure I had checked the basic canonical.com repository. But now I got a "no public key" error. It seems I had not copied the entire public key (including its header) as advised in that webpage. I tried again. This time it worked. But now I saw, in one of the comments, that this was going to give me highly experimental stuff, and if I wanted stable current drivers I should instead use https://launchpad.net/%7Eubuntu-x-swat/+archive/x-updates. I had to click and unclick one of the Third-Party Software entries to get the Reload option. But evidently I misunderstood the purpose of this exercise. When I was done, Synaptic did not show any newer nVidia drivers. Hardware Drivers did show that I now had a free nvidia driver that was currently in use (as distinct from nVidia's own proprietary drivers), but I was still not able to get custom effects working in Appearance.

Having spent enough time to reinstall the system twice, I gave up and did exactly that.

FactTech (launchpad-facttechnologies) said :
#11

I haven't experienced a lockup on start for over a week. The critical difference seemed to be an update to the intel video driver I got from jaunty-proposed.

I'm still experiencing a black screen hang frequently (but unpredictably) whenever I switch to a console using CTRL-ALT-<#> or issue 'sudo /etc/init.d/gdm restart' from a console. However, since this question was originally about the hang on startup, I am closing it.