		The Resource Counter

The resource counter, declared in include/linux/res_counter.h,
is meant to simplify resource management for controllers
by providing common accounting infrastructure.

This infrastructure consists of the res_counter structure and the
routines that operate on it.



1. Crucial parts of the res_counter structure

 a. unsigned long long usage

	The usage value shows the amount of a resource that is consumed
	by a group at a given time. The units of measurement should be
	determined by the controller that uses this counter. E.g. it can
	be bytes, items or any other unit the controller operates on.

 b. unsigned long long max_usage

	The maximal value of the usage over time.

	This value is useful when gathering statistical information about
	a particular group, as it shows the actual resource requirements
	of that group, not just some usage snapshot.

 c. unsigned long long limit

	The maximal amount of the resource that the group is allowed to
	consume. If the group requests more, so that the usage value
	would exceed the limit, the resource allocation is rejected (see
	the next section).

 d. unsigned long long failcnt

	The failcnt stands for "failures counter". This is the number of
	resource allocation attempts that failed.

 e. spinlock_t lock

	Protects changes of the above values.
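The fields above can be pictured with a small userspace model. This is
only a sketch for illustration, not a copy of the kernel header: the
names rc_model and rc_model_set_usage are hypothetical, and the lock is
left out because the sketch is single-threaded.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model of the fields described above. */
struct rc_model {
	unsigned long long usage;	/* amount consumed right now */
	unsigned long long max_usage;	/* highest usage seen so far */
	unsigned long long limit;	/* maximal allowed usage */
	unsigned long long failcnt;	/* number of rejected charges */
	/* spinlock_t lock is omitted in this single-threaded sketch */
};

/* Record a new usage value and keep max_usage as a high-water mark. */
static void rc_model_set_usage(struct rc_model *rc, unsigned long long usage)
{
	assert(rc != NULL);
	rc->usage = usage;
	if (usage > rc->max_usage)
		rc->max_usage = usage;
}
```

Note that max_usage only moves up here: lowering the usage leaves the
recorded peak untouched.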



2. Basic accounting routines

 a. void res_counter_init(struct res_counter *rc,
				struct res_counter *rc_parent)

	Initializes the resource counter. As usual, this should be the
	first routine called for a new counter.

	The rc_parent argument can be used to define a hierarchical
	child -> parent relationship directly in the res_counter structure;
	pass NULL to define no relationship.

 b. int res_counter_charge(struct res_counter *rc, unsigned long val,
				struct res_counter **limit_fail_at)

	When a resource is about to be allocated it has to be accounted
	with the appropriate resource counter (the controller should
	determine which one to use on its own). This operation is called
	"charging".

	It does not matter much which operation - resource allocation
	or charging - is performed first, but
	  * if the allocation is performed first, this may create a
	    temporary resource over-usage until the resource counter is
	    charged;
	  * if the charging is performed first, then the counter should be
	    uncharged on the error path (if one is taken).

	If the charging fails and a hierarchical dependency exists, the
	limit_fail_at parameter is set to the particular res_counter element
	where the charging failed.

 c. int res_counter_charge_locked
			(struct res_counter *rc, unsigned long val)

	The same as res_counter_charge(), but it does not acquire or release
	the res_counter->lock internally (it must be called with
	res_counter->lock held).

 d. void res_counter_uncharge[_locked]
			(struct res_counter *rc, unsigned long val)

	When a resource is released (freed) it should be de-accounted
	from the resource counter it was accounted to.  This is called
	"uncharging".

	The _locked routines assume that the caller already holds
	res_counter->lock.
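The charging and uncharging rules above, including the hierarchical
case, can be sketched in userspace as follows. This is an illustrative
model, not the kernel implementation: all rc_model_* names are
hypothetical, locking is omitted, and the choices to start with an
unlimited counter, to return -ENOMEM on failure and to roll back the
levels that were already charged are assumptions of the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical userspace model; the real lock is omitted. */
struct rc_model {
	unsigned long long usage, max_usage, limit, failcnt;
	struct rc_model *parent;	/* NULL means no hierarchy */
};

static void rc_model_init(struct rc_model *rc, struct rc_model *parent)
{
	assert(rc != NULL);
	rc->usage = 0;
	rc->max_usage = 0;
	rc->limit = ~0ULL;	/* assumption: start unlimited */
	rc->failcnt = 0;
	rc->parent = parent;
}

/* Charge a single level, without walking the hierarchy. */
static int rc_model_charge_one(struct rc_model *rc, unsigned long val)
{
	if (rc->usage + val > rc->limit) {
		rc->failcnt++;
		return -ENOMEM;
	}
	rc->usage += val;
	if (rc->usage > rc->max_usage)
		rc->max_usage = rc->usage;
	return 0;
}

/* Uncharge a single level, clamping at zero instead of underflowing. */
static void rc_model_uncharge_one(struct rc_model *rc, unsigned long val)
{
	rc->usage -= (val > rc->usage) ? rc->usage : val;
}

/* Charge rc and every ancestor; undo everything if one level refuses. */
static int rc_model_charge(struct rc_model *rc, unsigned long val,
			   struct rc_model **limit_fail_at)
{
	struct rc_model *c, *u;

	*limit_fail_at = NULL;
	for (c = rc; c != NULL; c = c->parent) {
		if (rc_model_charge_one(c, val) < 0) {
			*limit_fail_at = c;
			for (u = rc; u != c; u = u->parent)
				rc_model_uncharge_one(u, val);
			return -ENOMEM;
		}
	}
	return 0;
}

/* Uncharge rc and every ancestor. */
static void rc_model_uncharge(struct rc_model *rc, unsigned long val)
{
	for (; rc != NULL; rc = rc->parent)
		rc_model_uncharge_one(rc, val);
}
```

With a child counter below a parent limited to 100 units, a first
charge of 60 succeeds on both levels, while a second charge of 60 fails
and reports the parent through limit_fail_at.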

 2.1 Other accounting routines

    There are more routines that may help you with common needs, like
    checking whether the limit is reached or resetting the max_usage
    value. They are all declared in include/linux/res_counter.h.
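
To give an idea of what such helpers do, here is a userspace sketch of
the two needs mentioned above. The names rc_model_under_limit and
rc_model_reset_max are hypothetical and are not taken from the header;
check include/linux/res_counter.h for the routines that actually exist.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; the real lock is omitted. */
struct rc_model {
	unsigned long long usage, max_usage, limit, failcnt;
};

/* True while the counter still has room below its limit. */
static bool rc_model_under_limit(const struct rc_model *rc)
{
	assert(rc != NULL);
	return rc->usage < rc->limit;
}

/* Restart the high-water mark from the current usage. */
static void rc_model_reset_max(struct rc_model *rc)
{
	assert(rc != NULL);
	rc->max_usage = rc->usage;
}
```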



3. Analyzing the resource counter registrations

 a. If the failcnt value constantly grows, this means that the counter's
    limit is too tight. Either the group is misbehaving and consumes too
    many resources, or the configuration is not suitable for the group
    and the limit should be increased.

 b. The max_usage value can be used to quickly tune the group. One may
    set the limits to maximal values and either load the container with
    a common pattern or leave it running for a while. After this the
    max_usage value shows the amount of the resource the container
    would require during its common activity.

    Setting the limit a bit above this value gives a pretty good
    configuration that works in most cases.

 c. If the max_usage is much less than the limit, but the failcnt value
    is growing, then the group tries to allocate a big chunk of resource
    at once.

 d. If the max_usage is much less than the limit, but the failcnt value
    is 0, then the group has been given a higher limit than it
    requires. It is better to lower the limit a bit, leaving more
    resource for other groups.
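
The rules of thumb above can be written down as a small decision
helper. This is a sketch only: the rc_stats structure and the hint
names are hypothetical, and "much less than the limit" is arbitrarily
taken to mean "below half of the limit", which a real tool would make
tunable.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical snapshot of values sampled from one counter. */
struct rc_stats {
	unsigned long long max_usage, limit;
	unsigned long long failcnt_before, failcnt_now;	/* two samples */
};

enum rc_hint {
	RC_HINT_NONE,		/* nothing to suggest */
	RC_HINT_BIG_CHUNK,	/* rule c: large allocations at once */
	RC_HINT_TOO_TIGHT,	/* rule a: limit too tight or misbehaving group */
	RC_HINT_TOO_HIGH,	/* rule d: limit can be lowered */
};

static enum rc_hint rc_stats_hint(const struct rc_stats *s)
{
	int failing, far_below;

	assert(s != NULL);
	failing = s->failcnt_now > s->failcnt_before;
	/* Assumption: "much less" means below half of the limit. */
	far_below = s->max_usage < s->limit / 2;

	if (failing && far_below)
		return RC_HINT_BIG_CHUNK;
	if (failing)
		return RC_HINT_TOO_TIGHT;
	if (far_below && s->failcnt_now == 0)
		return RC_HINT_TOO_HIGH;
	return RC_HINT_NONE;
}
```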



4. Communication with the control groups subsystem (cgroups)

All the resource controllers that are using cgroups and resource counters
should provide files (in the cgroup filesystem) to work with the resource
counter fields. They are recommended to adhere to the following rules:

 a. File names

	Field name	File name
	---------------------------------------------------
	usage		usage_in_<unit_of_measurement>
	max_usage	max_usage_in_<unit_of_measurement>
	limit		limit_in_<unit_of_measurement>
	failcnt		failcnt
	lock		no file :)

 b. Reading from file should show the corresponding field value in the
    appropriate format.

 c. Writing to file

	Field		Expected behavior
	----------------------------------
	usage		prohibited
	max_usage	reset to usage
	limit		set the limit
	failcnt		reset to zero
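
The write rules in the table above can be modelled as a single
dispatch routine. This is a userspace sketch, not a cgroup file
handler: the rc_field enum and rc_model_write are hypothetical names,
and returning -EINVAL for a prohibited write is an assumption of the
sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical userspace model; the real lock is omitted. */
struct rc_model {
	unsigned long long usage, max_usage, limit, failcnt;
};

enum rc_field { RC_USAGE, RC_MAX_USAGE, RC_LIMIT, RC_FAILCNT };

/* Apply a write to one field following the "Writing to file" table. */
static int rc_model_write(struct rc_model *rc, enum rc_field field,
			  unsigned long long val)
{
	assert(rc != NULL);
	switch (field) {
	case RC_USAGE:
		return -EINVAL;			/* prohibited */
	case RC_MAX_USAGE:
		rc->max_usage = rc->usage;	/* reset to usage, val ignored */
		return 0;
	case RC_LIMIT:
		rc->limit = val;		/* set the limit */
		return 0;
	case RC_FAILCNT:
		rc->failcnt = 0;		/* reset to zero, val ignored */
		return 0;
	}
	return -EINVAL;
}
```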



5. Usage example

 a. Declare a task group (take a look at cgroups subsystem for this) and
    fold a res_counter into it

	struct my_group {
		struct res_counter res;

		<other fields>
	};

 b. Put hooks in resource allocation/release paths

	int alloc_something(...)
	{
		struct res_counter *fail_res;

		if (res_counter_charge(res_counter_ptr, amount, &fail_res) < 0)
			return -ENOMEM;

		<allocate the resource and return to the caller>
	}

	void release_something(...)
	{
		res_counter_uncharge(res_counter_ptr, amount);

		<release the resource>
	}

    In order to keep the usage value self-consistent, both the
    "res_counter_ptr" and the "amount" in release_something() should be
    the same as they were in alloc_something() when the resource being
    released was allocated.

 c. Provide a way to read and set the res_counter values (the cgroups
    subsystem can help with this).

 d. Compile and run :)
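
For readers who want to experiment outside the kernel, the steps above
can be mimicked with a self-contained userspace sketch. It is not
kernel code: rc_model and its helpers are hypothetical stand-ins for
the res_counter routines, there is no hierarchy and no locking, and
malloc()/free() play the role of the controlled resource.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct res_counter (no lock, no parent). */
struct rc_model {
	unsigned long long usage, max_usage, limit, failcnt;
};

struct my_group {
	struct rc_model res;
	/* other fields */
};

static int rc_model_charge(struct rc_model *rc, unsigned long val)
{
	assert(rc != NULL);
	if (rc->usage + val > rc->limit) {
		rc->failcnt++;
		return -ENOMEM;
	}
	rc->usage += val;
	if (rc->usage > rc->max_usage)
		rc->max_usage = rc->usage;
	return 0;
}

static void rc_model_uncharge(struct rc_model *rc, unsigned long val)
{
	assert(rc != NULL);
	rc->usage -= (val > rc->usage) ? rc->usage : val;
}

/* Charge first, then allocate; uncharge again on the error path. */
static void *alloc_something(struct my_group *grp, size_t size)
{
	void *p;

	if (rc_model_charge(&grp->res, size) < 0)
		return NULL;
	p = malloc(size);
	if (p == NULL)
		rc_model_uncharge(&grp->res, size);
	return p;
}

/* The size must match the one passed to alloc_something(). */
static void release_something(struct my_group *grp, void *p, size_t size)
{
	rc_model_uncharge(&grp->res, size);
	free(p);
}
```

Charging before allocating is the second ordering discussed in
section 2, which is why the malloc() failure path has to uncharge.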