PCI Power Management
~~~~~~~~~~~~~~~~~~~~

An overview of the concepts and the related functions in the Linux kernel

Patrick Mochel <mochel@transmeta.com>

---------------------------------------------------------------------------

1. Overview
2. How The PCI Subsystem Handles Power Management
3. PCI Utility Functions
4. PCI Device Drivers
5. Resources

1. Overview
~~~~~~~~~~~

The PCI Power Management Specification was introduced between the PCI 2.1 and
PCI 2.2 Specifications. It defines a standard interface for controlling various
power management operations.

Implementation of the PCI PM Spec is optional, as are several sub-components of
it. If a device supports the PCI PM Spec, the device will have an 8 byte
capability field in its PCI configuration space. This field is used to describe
and control the standard PCI power management features.

The PCI PM spec defines 4 operating states for devices (D0 - D3) and for buses
(B0 - B3). The higher the number, the less power the device consumes. However,
the higher the number, the longer the latency is for the device to return to
an operational state (D0).

Bus power management is not covered in this version of this document.

Note that all PCI devices support D0 and D3 by default, regardless of whether or
not they implement any of the PCI PM spec.

The possible state transitions that a device can undergo are:

+---------------------------+
| Current State | New State |
+---------------------------+
| D0            | D1, D2, D3|
+---------------------------+
| D1            | D2, D3    |
+---------------------------+
| D2            | D3        |
+---------------------------+
| D1, D2, D3    | D0        |
+---------------------------+

Note that when the system is entering a global suspend state, all devices will
be placed into D3 and when resuming, all devices will be placed into D0.
However, when the system is running, other state transitions are possible.

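The transition table above can be captured in a small predicate. The following
is a minimal userspace sketch, not kernel code: the enum and the function name
pm_transition_allowed() are hypothetical, and D3 is treated as a single state.

```c
#include <stdbool.h>

/* Hypothetical names for illustration; not part of the kernel API. */
enum pm_state { PM_D0 = 0, PM_D1, PM_D2, PM_D3 };

/*
 * Return true if the table lists a transition from 'cur' to 'next':
 * a device may move to any deeper state, or from any low power state
 * straight back to D0.
 */
bool pm_transition_allowed(int cur, int next)
{
	if (cur < PM_D0 || cur > PM_D3 || next < PM_D0 || next > PM_D3)
		return false;
	if (next > cur)
		return true;	/* D0->D1/D2/D3, D1->D2/D3, D2->D3 */
	return next == PM_D0 && cur != PM_D0;	/* D1/D2/D3 -> D0 */
}
```

For example, pm_transition_allowed(PM_D2, PM_D1) is false, while
pm_transition_allowed(PM_D2, PM_D0) is true.
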
2. How The PCI Subsystem Handles Power Management
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The PCI suspend/resume functionality is accessed indirectly via the Power
Management subsystem. At boot, the PCI driver registers a power management
callback with that layer. Upon entering a suspend state, the PM layer iterates
through all of its registered callbacks. This currently takes place only during
APM state transitions.

Upon going to sleep, the PCI subsystem walks its device tree twice. Both times,
it does a depth first walk of the device tree. The first walk saves each
device's state and checks for devices that will prevent the system from entering
a global power state. The next walk then places the devices in a low power
state.

The first walk allows a graceful recovery in the event of a failure, since none
of the devices have actually been powered down.

In both walks, in particular the second, all children of a bridge are touched
before the actual bridge itself. This allows the bridge to retain power while
its children are being accessed.

Upon resuming from sleep, just the opposite must be true: all bridges must be
powered on and restored before their children are powered on. This is easily
accomplished with a breadth-first walk of the PCI device tree.

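The two walk orders can be sketched on a toy tree. This is a minimal userspace
illustration, assuming a hypothetical struct node with first-child and
next-sibling links; it is not how the kernel represents PCI devices, and
MAX_NODES is an arbitrary bound chosen for the example.

```c
#include <stddef.h>

#define MAX_NODES 64	/* arbitrary limit for this sketch */

/* Hypothetical toy device node; the kernel's real structures differ. */
struct node {
	int id;
	struct node *child;	/* first child behind this bridge */
	struct node *sibling;	/* next device on the same bus */
};

/*
 * Suspend order: depth-first, children before the bridge above them.
 * Appends ids to 'out' starting at 'pos' and returns the new position.
 */
size_t suspend_order(const struct node *n, int *out, size_t pos)
{
	for (; n; n = n->sibling) {
		pos = suspend_order(n->child, out, pos);
		out[pos++] = n->id;
	}
	return pos;
}

/*
 * Resume order: breadth-first, every bridge before its children.
 * 'out' must have room for MAX_NODES ids; returns the number written.
 */
size_t resume_order(const struct node *root, int *out)
{
	const struct node *queue[MAX_NODES];
	size_t head = 0, tail = 0, pos = 0;
	const struct node *n;

	for (n = root; n && tail < MAX_NODES; n = n->sibling)
		queue[tail++] = n;
	while (head < tail) {
		n = queue[head++];
		out[pos++] = n->id;
		for (n = n->child; n && tail < MAX_NODES; n = n->sibling)
			queue[tail++] = n;
	}
	return pos;
}
```

For a bridge 1 with children 2 and 3, where 2 is itself a bridge with child 4,
suspend_order() visits 4, 2, 3, 1 and resume_order() visits 1, 2, 3, 4.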

3. PCI Utility Functions
~~~~~~~~~~~~~~~~~~~~~~~~

These are helper functions designed to be called by individual device drivers.
Assuming that a device behaves as advertised, these should be applicable in most
cases. However, results may vary.

Note that these functions are never implicitly called for the driver. The driver
is always responsible for deciding when and if to call these.


pci_save_state
--------------

Usage:
	pci_save_state(dev, buffer);

Description:
	Save first 64 bytes of PCI config space. Buffer must be allocated by
	caller.


pci_restore_state
-----------------

Usage:
	pci_restore_state(dev, buffer);

Description:
	Restore previously saved config space (first 64 bytes only).

	If buffer is NULL, then restore what information we know about the
	device from bootup: BARs and interrupt line.


pci_set_power_state
-------------------

Usage:
	pci_set_power_state(dev, state);

Description:
	Transition device to low power state using PCI PM Capabilities
	registers.

	Will fail under one of the following conditions:
	- If state is less than current state, but not D0 (illegal transition)
	- Device doesn't support PM Capabilities
	- Device does not support requested state


pci_enable_wake
---------------

Usage:
	pci_enable_wake(dev, state, enable);

Description:
	Enable device to generate PME# during low power state using PCI PM
	Capabilities.

	Checks whether the device supports generating PME# from the requested
	state and fails if it does not, unless enable == 0 (request is to
	disable wake events, which is implicit if it doesn't even support it in
	the first place).

	Note that the PMC Register in the device's PM Capabilities has a bitmask
	of the states it supports generating PME# from. D3hot is bit 3 and
	D3cold is bit 4. So, while a value of 4 as the state may not seem
	semantically correct, it is.


4. PCI Device Drivers
~~~~~~~~~~~~~~~~~~~~~

These functions are intended for use by individual drivers, and are defined in
struct pci_driver:

        int  (*save_state) (struct pci_dev *dev, u32 state);
        int  (*suspend) (struct pci_dev *dev, u32 state);
        int  (*resume) (struct pci_dev *dev);
        int  (*enable_wake) (struct pci_dev *dev, u32 state, int enable);


save_state
----------

Usage:

if (dev->driver && dev->driver->save_state)
	dev->driver->save_state(dev, state);

The driver should use this callback to save device state. It should take into
account the current state of the device and the requested state in order to
avoid any unnecessary operations.

For example, on a video card that supports all 4 states (D0-D3), all controller
context is preserved when entering D1, but the screen is placed into a low power
state (blanked).

The driver can also interpret this function as a notification that it may be
entering a sleep state in the near future. If it knows that the device cannot
enter the requested state, either because of lack of support for it, or because
the device is in the middle of some critical operation, then it should fail.

This function should not be used to set any state in the device or the driver
because the device may not actually enter the sleep state (e.g. another driver
later causes a global state transition to fail).

Note that in intermediate low power states, a device's I/O and memory spaces may
be disabled and may not be available in subsequent transitions to lower power
states.

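The reason save_state must only record context, and may veto the request, can
be illustrated with a model of the two passes. This is a minimal userspace
sketch under stated assumptions: struct mock_dev and both functions are
hypothetical stand-ins, not the kernel's struct pci_dev or its PM core.

```c
#include <stddef.h>

/* Hypothetical mock of a device; not the kernel's struct pci_dev. */
struct mock_dev {
	int busy;		/* non-zero: in a critical operation */
	int reg;		/* some live device context */
	int saved_reg;		/* copy taken by the save_state model */
	int current_state;	/* 0..3 for D0..D3 */
};

/* Model of a save_state callback: copy context or veto, nothing else. */
int mock_save_state(struct mock_dev *dev, int state)
{
	(void)state;		/* a real driver would inspect this */
	if (dev->busy)
		return -1;	/* cannot sleep now; fail the request */
	dev->saved_reg = dev->reg;	/* power state is left untouched */
	return 0;
}

/* Model of the two passes: save everything first, then power down. */
int mock_global_suspend(struct mock_dev *devs, size_t n, int state)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (mock_save_state(&devs[i], state) != 0)
			return -1;	/* abort: nothing was powered down */
	for (i = 0; i < n; i++)
		devs[i].current_state = state;
	return 0;
}
```

If any device vetoes in the first pass, every device is still in its original
power state, so no recovery work is needed.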

suspend
-------

Usage:

if (dev->driver && dev->driver->suspend)
	dev->driver->suspend(dev, state);

A driver uses this function to actually transition the device into a low power
state. This may include disabling I/O, memory and bus-mastering, as well as
physically transitioning the device to a lower power state.

Bus mastering may be disabled by doing:

pci_disable_device(dev);

For devices that support the PCI PM Spec, this may be used to set the device's
power state:

pci_set_power_state(dev, state);

The driver is also responsible for disabling any other device-specific features
(e.g. blanking the screen, turning off on-card memory, etc).

The driver should be sure to track the current state of the device, as it may
obviate the need for some operations.

The driver should update the current_state field in its pci_dev structure in
this function.

resume
------

Usage:

if (dev->driver && dev->driver->resume)
	dev->driver->resume(dev);

The resume callback may be called from any power state, and is always meant to
transition the device to the D0 state.

The driver is responsible for reenabling any features of the device that had
been disabled during previous suspend calls and restoring all state that was
saved in previous save_state calls.

If the device is currently in D3, it must be completely reinitialized, as it
must be assumed that the device has lost all of its context (even that of its
PCI config space). For almost all current drivers, this means that the
initialization code that the driver does at boot must be separated out and
called again from the resume callback. Note that some values for the device may
not have to be probed for this time around if they are saved before entering the
low power state.

If the device supports the PCI PM Spec, it can use this to physically transition
the device to D0:

pci_set_power_state(dev, 0);

Note that if the entire system is transitioning out of a global sleep state, all
devices will be placed in the D0 state, so this is not necessary. However, in
the event that the device is placed in the D3 state during normal operation,
this call is necessary. It is impossible to determine which of the two events is
taking place in the driver, so it is always a good idea to make that call.

The driver should take note of the state that it is resuming from in order to
ensure correct (and speedy) operation.

The driver should update the current_state field in its pci_dev structure in
this function.

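The interplay between suspend and resume can be modelled as follows. This is a
minimal userspace sketch, not driver code: struct fake_pci_dev and all fake_*
functions are hypothetical stand-ins (fake_set_power_state() only imitates
pci_set_power_state() by discarding context on entry to D3).

```c
/* Hypothetical userspace stand-in for the kernel's struct pci_dev. */
struct fake_pci_dev {
	int current_state;	/* 0..3 for D0..D3, tracked by the driver */
	int initialized;	/* non-zero while device context is valid */
	int reinit_count;	/* how often the init stand-in has run */
};

/* Stand-in for pci_set_power_state(); entering D3 loses all context. */
int fake_set_power_state(struct fake_pci_dev *dev, int state)
{
	if (state < 0 || state > 3)
		return -1;
	if (state == 3)
		dev->initialized = 0;
	return 0;
}

/* Stand-in for the driver's boot-time initialization code. */
void fake_hw_init(struct fake_pci_dev *dev)
{
	dev->initialized = 1;
	dev->reinit_count++;
}

/* Model of a suspend callback that tracks the current state. */
int fake_suspend(struct fake_pci_dev *dev, int state)
{
	if (state == dev->current_state)
		return 0;	/* already there; nothing to do */
	if (fake_set_power_state(dev, state) != 0)
		return -1;
	dev->current_state = state;
	return 0;
}

/* Model of a resume callback: always go to D0, reinit only after D3. */
int fake_resume(struct fake_pci_dev *dev)
{
	int from = dev->current_state;

	/* Always request D0; the driver cannot tell who put it to sleep. */
	if (fake_set_power_state(dev, 0) != 0)
		return -1;
	if (from == 3)
		fake_hw_init(dev);	/* context was lost in D3 */
	dev->current_state = 0;
	return 0;
}
```

Remembering the state being resumed from lets the model skip the full
reinitialization after D1 or D2 and run it only after D3.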

enable_wake
-----------

Usage:

if (dev->driver && dev->driver->enable_wake)
	dev->driver->enable_wake(dev, state, enable);

This callback is generally only relevant for devices that support the PCI PM
spec and have the ability to generate a PME# (Power Management Event Signal)
to wake the system up. (However, it is possible that a device may support
some non-standard way of generating a wake event on sleep.)

Bits 15:11 of the PMC (Power Mgmt Capabilities) Register in a device's
PM Capabilities describe what power states the device supports generating a
wake event from:

+------------------+
|  Bit  |  State   |
+------------------+
|  11   |   D0     |
|  12   |   D1     |
|  13   |   D2     |
|  14   |   D3hot  |
|  15   |   D3cold |
+------------------+

A device can use this to enable wake events:

	 pci_enable_wake(dev, state, enable);

Note that to enable PME# from D3cold, a value of 4 should be passed to
pci_enable_wake (since it uses an index into a bitmask). If a driver gets
a request to enable wake events from D3, two calls should be made to
pci_enable_wake (one each for D3hot and D3cold).

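The index-into-a-bitmask convention can be made concrete with a small helper.
This is a minimal userspace sketch under a stated assumption: the PME support
field sits in bits 15:11 of the PMC register with D0 as its lowest bit, which
is the layout consistent with passing 4 for D3cold. The macro and function
names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: PME support field in bits 15:11, D0 at bit 11. */
#define PMC_PME_SHIFT	11
#define PMC_PME_MASK	0x1fu	/* D0, D1, D2, D3hot, D3cold */

/*
 * Return true if the PMC value advertises PME# from 'state', where
 * state is 0 = D0, 1 = D1, 2 = D2, 3 = D3hot, 4 = D3cold.
 */
bool pmc_pme_supported(uint16_t pmc, unsigned int state)
{
	unsigned int mask = (pmc >> PMC_PME_SHIFT) & PMC_PME_MASK;

	if (state > 4)
		return false;
	return (mask >> state) & 1u;
}
```

With a PMC value of 0xC800 (bits 15, 14 and 11 set), the helper reports
support for D0, D3hot and D3cold, but not for D1 or D2.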

5. Resources
~~~~~~~~~~~~

PCI Local Bus Specification
PCI Bus Power Management Interface Specification

  http://pcisig.org
