Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

colormgr get-devices is empty when two identical monitor models are plugged in #164

Open
voidptr127 opened this issue Jan 24, 2024 · 20 comments

Comments

@voidptr127
Copy link

System: Fedora 39 (Workstation), GNOME: 45.3, Kernel 6.6.12-200

I can't assign ICC profiles to my monitors because colormgr get-devices returns nothing when two identical monitor models are plugged in. This also affects the gnome-color-manager where neither monitor is shown when both are plugged in.

First, I checked system logs and got the following:

$ sudo journalctl -xe | grep colord
[...] gnome-shell[2099]: Failed to create colord device for 'xrandr-Acer Technologies-XF270HU-T78EE0048521': device id 'xrandr-Acer Technologies-XF270HU-T78EE0048521' already exists

Next, I unplugged both displayport cables, then only plugged one monitor back in. In this case, everything works as expected:

$ sudo colormgr get-devices 
Object Path:   /org/freedesktop/ColorManager/devices/xrandr_Acer_Technologies_XF270HU_T78EE0048521_xxxxx_1234
Owner:         xxxxx
Created:       January 24 2024, 03:08:54 PM
Modified:      January 24 2024, 03:08:54 PM
Type:          display
Enabled:       Yes
Embedded:      No
Model:         XF270HU
Vendor:        Acer
Serial:        T78EE0048521
Scope:         temp
Colorspace:    rgb
Device ID:     xrandr-Acer Technologies-XF270HU-T78EE0048521
Profile 1:     icc-f02b98744a0e3b3abfdd767bf9bbc8ab
               /var/lib/colord/icc/Acer XF270HUA-center.icm
Metadata:      OutputEdidMd5=1a7e20b4e5f0fbc8d83026b0fc75d8a4
Metadata:      OutputPriority=primary
Metadata:      XRANDR_name=DP-2
Metadata:      OwnerCmdline=/usr/bin/gnome-shell

Note that at this stage, I could also assign the ICC profile. However, it only works for the monitor that was plugged in. As soon as I plug in the second monitor, colormgr get-devices returns nothing and the default ICC profile for the other monitor is used. I also compared both outputs of colormgr get-devices. Both return exactly the same model and serial number.

Here is two times the output from colord with the --verbose flag set:

The first output shows what colord prints when I did stop colord and unplugged both monitors. Then I hit enter (blindly) to start colord, and plugged both monitors in, one after another. The second output is printed when I killed the previous process and started it afterwards while both monitors were already plugged in.

Particularly interesting might be these lines from the second file:

sudo -u colord /usr/libexec/colord --verbose
[...]
16:26:01	display card1-DP-2 has ID 'xrandr-Acer-XF270HU-T78EE0048521' from MD5 1a7e20b4e5f0fbc8d83026b0fc75d8a4
16:26:01	display card1-DP-3 has ID 'xrandr-Acer-XF270HU-T78EE0048521' from MD5 5516e0495119e86738e01422cfb835aa
[...]

I am not entirely sure about the expected behavior, but I suspect that the names should differ to avoid the confusion.

@hughsie
Copy link
Owner

hughsie commented Jan 24, 2024

The monitors are supposed to have different serial numbers. I'm afraid you might have to dig into the code to find a workaround we can apply upstream.

@voidptr127
Copy link
Author

I checked the serial numbers on the back of my monitors. It turns out that the serial number is truncated. It uses the first eight letters of the serial number: T78EE004 ... and the last 4 digits "8521" which apparently match. The other 8 hex digits in-between are entirely different.

Do you know why there are different md5 hashes then? When I cat /sys/class/drm/card1-DP-2/device/card1-DP-2/edid I see the truncated serial number with non-defined binary output before and after. So I guess the reading comes from there.

@voidptr127
Copy link
Author

voidptr127 commented Jan 24, 2024

Found my missing serial number digits. They are not encoded in hexadecimal but in decimal. However, they match the numbers on the back of my monitors:

# cat /sys/class/drm/card1-DP-2/edid | edid-decode | grep -E 'Display Product|Serial'
    Serial Number: 1422285011
    Display Product Name: 'XF270HU'
    Display Product Serial Number: 'T78EE0048521'
# cat /sys/class/drm/card1-DP-3/edid | edid-decode | grep -E 'Display Product|Serial'
    Serial Number: 1422285039
    Display Product Name: 'XF270HU'
    Display Product Serial Number: 'T78EE0048521'

It appears to be the only difference between both edid files when I do a diff (the first part is a hex dump displayed by edid-decode.)

# diff <(cat /sys/class/drm/card1-DP-2/edid | edid-decode) <(cat /sys/class/drm/card1-DP-3/edid | edid-decode)
[...]
<     Serial Number: 1422285011
---
>     Serial Number: 1422285039
[...]

I don't know about the internals of the edid file format. It seems that both serial numbers need to be concatenated when creating the device id. Any other monitor examples where this is similar? I guess one could fallback to the current scheme if the "Serial Number" field does not exist.

@voidptr127
Copy link
Author

voidptr127 commented Jan 24, 2024

I checked the EDID file format with the information on Wikipedia and it turns out that the missing numbers are supposed to be the "real" serial numbers. They can be directly read from the edid file at bytes 12-15 (32 bits, little endian, index starts at 0). The other "Display Product Serial Number" seems to be an "display/monitor descriptor" (max 18 bytes), which is not necessarily unique.

I am not sure which source for the serial number colord uses, as I could not directly find a way to get this information with xrandr. However, I am pretty sure that this is a bug how the name of the device-id is constructed. I.e., the "wrong" serial number is used, or at least it is not complete like in my case.

The check implemented in file src/cd-main.c beginning from line 2214 in cd_main_check_duplicate_edids is somewhat superfluous then. Because edid files encode the serial number and serial numbers are always different, the edid files are also always different. Thus, the hashes should never be the same. I.e., this check will always return false (if there is nothing wrong with the system or monitor). If this check is there to deal with "bogus systems / monitors" then forget about what I've just written.

What do you think?

@voidptr127
Copy link
Author

Found the problem in lib/colord/cd-edid.c

First hint at ll. 545ff:

/* maybe there isn't a ASCII serial number descriptor, so use this instead */
serial = (guint32) data[CD_EDID_OFFSET_SERIAL+0];
serial += (guint32) data[CD_EDID_OFFSET_SERIAL+1] * 0x100;
serial += (guint32) data[CD_EDID_OFFSET_SERIAL+2] * 0x10000;
serial += (guint32) data[CD_EDID_OFFSET_SERIAL+3] * 0x1000000;
if (serial > 0)
	priv->serial_number = g_strdup_printf ("%" G_GUINT32_FORMAT, serial);

This indicates that it is assumed that the serial number is the same as the display product serial number, but they are not.

After line 603, the serial number is then overwritten with the display product serial number:

[...]
} else if (data[i+3] == CD_DESCRIPTOR_DISPLAY_PRODUCT_SERIAL_NUMBER) {
	tmp = cd_edid_parse_string (&data[i+5]);
	if (tmp != NULL) {
		g_free (priv->serial_number);
		priv->serial_number = tmp;
	}
}
[...]

Which is then used in src/cd-main.c to create the device id (ll. 2199ff):

static gchar *
cd_main_get_display_id (CdEdid *edid)
{
	GString *device_id;

	device_id = g_string_new ("xrandr");
	if (cd_edid_get_vendor_name (edid) != NULL)
		g_string_append_printf (device_id, "-%s", cd_edid_get_vendor_name (edid));
	if (cd_edid_get_monitor_name (edid) != NULL)
		g_string_append_printf (device_id, "-%s", cd_edid_get_monitor_name (edid));
	if (cd_edid_get_serial_number (edid) != NULL)
		g_string_append_printf (device_id, "-%s", cd_edid_get_serial_number (edid));
	return g_string_free (device_id, FALSE);
}

Which is not correct, in my opinion.

My suggestion would be to keep both separate and change the construction of the device id.

I.e., in lib/colord/cd-edid.c I'd suggest to add another field to the CdEdidPrivate struct (ll. 45ff):

typedef struct
{
	CdColorYxy		*red;
	CdColorYxy		*green;
	CdColorYxy		*blue;
	CdColorYxy		*white;
	gchar			*checksum;
	gchar			*eisa_id;
	gchar			*monitor_name;
	gchar			*pnp_id;
	gchar			*serial_number;
	/* the line below is added */
	gchar			*display_product_serial_number;
	gchar			*vendor_name;
	gdouble			 gamma;
	guint			 height;
	guint			 width;
} CdEdidPrivate;

Then assign the display product serial number to that separate field at ll. 603ff instead of replacing the serial number:

} else if (data[i+3] == CD_DESCRIPTOR_DISPLAY_PRODUCT_SERIAL_NUMBER) {
	priv->display_product_serial_number = cd_edid_parse_string (&data[i+5]);
	/*  
	// delete these commented lines 
	if (tmp != NULL) {
		g_free (priv->serial_number);
		priv->serial_number = tmp;
	}
	*/
}

Make sure to update cd_edid_reset and cd_edid_finalize as well (ll. 396 and ll. 643)

void
cd_edid_reset (CdEdid *edid)
{
	CdEdidPrivate *priv = GET_PRIVATE (edid);

	g_return_if_fail (CD_IS_EDID (edid));

	/* free old data */
	g_free (priv->monitor_name);
	g_free (priv->vendor_name);
	g_free (priv->serial_number);
	/* add the line below */
	g_free (priv->display_product_serial_number);
	g_free (priv->eisa_id);
	g_free (priv->checksum);

	/* do not deallocate, just blank */
	priv->pnp_id[0] = '\0';

	/* set to default values */
	priv->monitor_name = NULL;
	priv->vendor_name = NULL;
	priv->serial_number = NULL;
	/* add the line below */
	priv->display_product_serial_number = NULL;
	priv->eisa_id = NULL;
	priv->checksum = NULL;
	priv->width = 0;
	priv->height = 0;
	priv->gamma = 0.0f;
}

static void
cd_edid_finalize (GObject *object)
{
	CdEdid *edid = CD_EDID (object);
	CdEdidPrivate *priv = GET_PRIVATE (edid);

	g_free (priv->monitor_name);
	g_free (priv->vendor_name);
	g_free (priv->serial_number);
	/* add this line */
	g_free (priv->display_product_serial_number);
	g_free (priv->eisa_id);
	g_free (priv->checksum);
	g_free (priv->pnp_id);
	cd_color_yxy_free (priv->white);
	cd_color_yxy_free (priv->red);
	cd_color_yxy_free (priv->green);
	cd_color_yxy_free (priv->blue);

	G_OBJECT_CLASS (cd_edid_parent_class)->finalize (object);
}

Add a new function to access it:

const gchar *
cd_edid_get_display_product_serial_number (CdEdid *edid)
{
	CdEdidPrivate *priv = GET_PRIVATE (edid);
	g_return_val_if_fail (CD_IS_EDID (edid), NULL);
	return priv->display_product_serial_number;
}

Update cd_main_get_display_id in src/cd-main.c at ll. 2199ff:

static gchar *
cd_main_get_display_id (CdEdid *edid)
{
	GString *device_id;

	device_id = g_string_new ("xrandr");
	if (cd_edid_get_vendor_name (edid) != NULL)
		g_string_append_printf (device_id, "-%s", cd_edid_get_vendor_name (edid));
	if (cd_edid_get_monitor_name (edid) != NULL)
		g_string_append_printf (device_id, "-%s", cd_edid_get_monitor_name (edid));
	if (cd_edid_get_display_product_serial_number (edid) != NULL)
		g_string_append_printf (device_id, "-%s", cd_edid_get_display_product_serial_number (edid));
	if (cd_edid_get_serial_number (edid) != NULL)
		g_string_append_printf (device_id, "-%s", cd_edid_get_serial_number (edid));
	
	return g_string_free (device_id, FALSE);
}

However, I am not sure how the new naming scheme would affect existing installations.

@hughsie
Copy link
Owner

hughsie commented Jan 25, 2024

how the new naming scheme would affect existing installations

I think it would break all the profile mappings! :)

I think what's supposed to happen is that the CD_DESCRIPTOR_DISPLAY_PRODUCT_SERIAL_NUMBER is only used when the serial "number" contains non-numeric chars. Maybe you can add a quirk for that specific EDID model so that the CD_DESCRIPTOR_DISPLAY_PRODUCT_SERIAL_NUMBER gets ignored completely?

@voidptr127
Copy link
Author

I am really struggling to get this sorted. First, I implemented the changes I proposed, but it turns out that changing this only affects the debug message and not anything else in the code. Hence, commenting out the code that overwrites the serial number with the display product serial number did not change anything either.

Next thing I did was setting priv->use_xrandr_mode always to be TRUE in src/cd-main.c. This seemed to work at first, as two monitors were shown in the gnome-color-manager. However, somehow it still created the old device id and it was using it along the xrandr device id simultaneously. Although I was seeing 2 devices in the gnome color manager, changing profiles was only possible for 1 of 2 monitors. Changing profile for the other monitor just did nothing, although it displayed that it changed it.

I am tempted to give up and accept that I can't have my color-calibrated ICC profiles with colord installed.

All I want is to somehow overwrite the default serial number. Where is the serial number data read from? It's not read from the /sys/class/drm/*/edid files from what I can tell. I can't find any xrandr or other library calls to read data regarding the display device, although it has XRANDR_name as a property set. All I see in the code are SQLite queries and DBUS functions. But how does it find the serial number property or any other device property to install it there in the first place? Is there another DBUS instance where it reads that data from?

@hughsie
Copy link
Owner

hughsie commented Jan 25, 2024

Where is the serial number data read from

The desktop client reads the EDID and creates the device with colord using DBus. You'll need to install the library before it'll be used by the session.

@voidptr127
Copy link
Author

Thank you for the reply. Can you specify what you mean by desktop client? I see that lib/colord/cd-client.c provides the cd_client_create_device function, but I can't find what calls the function. Please excuse the spam on that topic.

Do you mean that I would need to fully overwrite the default installation of colord with a custom one? I suppose killing colord and starting the custom compiled colord executable is not enough then?

@hughsie
Copy link
Owner

hughsie commented Jan 25, 2024

Can you specify what you mean by desktop client

Well, in GNOME it's gnome-settings-daemon for instance.

Do you mean that I would need to fully overwrite the default installation of colord with a custom one

At least the colord shared library.

@voidptr127
Copy link
Author

I tried an installation yesterday without success. I am not entirely sure whether meson install copied the files to the wrong location for Fedora or whether GNOME performs some static linking that I am not aware of. But I did no further investigation by compiling GNOME or searching for a SPEC file to verify the locations.

However, your comment got me thinking that maybe GNOME itself is the issue here. I installed KDE today and both displays showed up in the settings out of the box. Although I could not assign profiles via the GUI, I was able to assign them via colormgr. Good enough, I guess.

KDE uses a different naming scheme that includes the connecting monitor port at the end of the name. This seems to be similar to the XRANDR_name fallback in colord.

I will keep using KDE, because it solved my color profile problem. So for me, the case is closed.

I'd still suggest to enhance colord by exposing the "other EDID serial number" somehow. Probably best to do so via a separate function in the lib/colord/cd-edid.c file and keeping the behavior of the currently returned (display product) serial number (priv->serial) as it is for backwards compatibilty. Probably a hint to the GNOME project is also not a bad idea that they are assuming that the display names should be unique, but it might not always be the case.

Alternatively (?) or additionally, one might extend the "duplicate edid" check with a "duplicate device id" check to change usage of device ids in the same manner. I.e., switch to xrandr device id naming scheme if two device ids will be the same. I think that using the xrandr naming scheme, should have actually solved my problem, as this is what I can observe when using KDE.

Thanks for your help! You can close the issue if you want or keep it open as a reference.

@gahLAHAD
Copy link

gahLAHAD commented May 4, 2024

I'm having the same issue on GNOME.
Is there any "solution" other than changing to KDE or dig the code myself?
For example, is it possible to change the device id manually before plugging in the other monitor?
Just a thought. Cheers!

Repository owner deleted a comment from barolo Jul 27, 2024
@gahLAHAD
Copy link

Update: I'm still having this issue and I couldn't find any solution for it. Since this issue is causing the night light feature from GNOME to not be applied on my two identical external monitors, I was practically forced to move to KDE or get my eyes burnt while using my PC. I also tried using X11 + Redshift but X11 is EXTREMELY laggy and unusable for me nowadays.
I just wish colord had another method to identify the two identical monitors, because this behavior doesn't make any sense. I hope this is fixed in the future. Cheers.

@hughsie
Copy link
Owner

hughsie commented Aug 31, 2024

Do you pay for a Red Hat subscription? If so I could look at this in "work time" so to speak.

@gahLAHAD
Copy link

gahLAHAD commented Sep 2, 2024

Do you pay for a Red Hat subscription? If so I could look at this in "work time" so to speak.

Nope, I don't think I do.
How much is that?

@hughsie
Copy link
Owner

hughsie commented Sep 2, 2024

How much is that?

No idea, sorry.

@pat-flew
Copy link

pat-flew commented Sep 17, 2024

To throw in another data point here, I have what seems like a similar situation but caused by the same monitor being attached to both integrated and discrete graphics outputs.

If I physically disconnect either output (or attach both to one GPU), the monitor appears in colormgr get-devices.

If multiple GPU connect to the same monitor, that monitor doesn't appear.

Also Wayland/Gnome.

@KBrand101
Copy link

Same issue here. If I compare the edid information from my two displays with

diff -u <(cat /sys/class/drm/card1-HDMI-A-1/edid | edid-decode) <(cat /sys/class/drm/card1-HDMI-A-2/edid | edid-decode)

it doesn't find any differences.

When checking the hash of the two files, it shows that they are completely identical in my case:

sha256sum /sys/class/drm/card1-HDMI-A-1/edid 
dac843f3dcc8aedacd109883f42b88a0df6872c6cb49e65d9c065b8fc38d5536  /sys/class/drm/card1-HDMI-A-1/edid
sha256sum /sys/class/drm/card1-HDMI-A-2/edid 
dac843f3dcc8aedacd109883f42b88a0df6872c6cb49e65d9c065b8fc38d5536  /sys/class/drm/card1-HDMI-A-2/edid

If I understand @voidptr127 correctly, these files contain the Display Product Serial Number which in my case is identical on both monitors, as well as everything else. As he already pointed out, this leads to messages like the one in his very first post where is says device id 'xxxx' already exists.

I'm on Gnome by the way. And I don't plan on swapping to KDE because of this.

If I recall correctly this worked fine with my old computer and my old OS which was Ubuntu 20.04 I believe. Since then (when I upgraded my pc) I kept the boot drive and upgraded it to Ubuntu 24.04 because of missing firmware for my new GPU. My monitors stayed the same tough. I don't remember if I was already using Wayland back then, but I am now.

Is there any way to hardcode/overwrite the serial number in a config file? Since the computer is stationary and always connected to the same displays, this would totally be acceptable.

Thanks @voidptr127 for the research.
@hughsie: Thanks for creating this software in the first place. Sadly I also don't have a Red Hat subscription but I would also really like to see this fixed. Do you see any chance of this happening any time soon? Is there another possibility to fund the work if necessary?

@hughsie
Copy link
Owner

hughsie commented Nov 7, 2024

@voidptr127 do you know why this code isn't working?

/* If the user has two or more outputs attached with identical EDID data
 * then the client tools cannot tell them apart. By setting this value
 * the 'xrandr-' style device-id is always used and the monitors will
 * show up as seporate instances.
 * This does of course mean that the calibration is referenced to the
 * xrandr output name, rather than the monitor itself. This means that
 * if the monitor cables are swapped then the wrong profile would be
 * used. */
priv->always_use_xrandr_name = cd_main_check_duplicate_edids ();

i.e. we check for duplicate EDIDs -- or is the check too strict and the EDIDs are not actually 100% the same, but do have the same model and serial number? I suspect cd_main_check_duplicate_edids() needs love if that's the case.

@KBrand101
Copy link

I think the issue is not in cd_main_check_duplicate_edids(). On my machine, the code correctly detects the two edid files as equal (having an identical checksum). I checked by running colord -vvvv. But I get no devices returned with colormgr get-devices (as is the original issue).

I have a suggestion though. Would it maybe be worth returning earlier from the function, if use_xrandr_mode = True? Currently it exits after adding the duplicate value to the hash table. But at this point it is already confirmed, that xrandr_mode should be used. And this is all the function determines.
So it would look like this:

// Old
		if (old_output != NULL) {
			g_debug ("output %s has duplicate EDID", fn);
			use_xrandr_mode = TRUE;
		}

// New
		if (old_output != NULL) {
			g_debug ("output %s has duplicate EDID", fn);
			use_xrandr_mode = TRUE;
			return use_xrandr_mode;
		}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants