

Exploring the Size of C++ Class Objects

🧑‍💻 User

In this lecture, we’re going to learn about the sizes of class objects. What are we going to see if we try to print the size of our class here, called Wrapper? Inside, we have two member variables of type int and one member variable of type std::string. What do you think we’ll see when we print the size of this Wrapper object? Try to think about it.

What C++ does is account only for the size of your member variables, so it’s going to add up the sizes of these guys in memory. We know an int is going to take four bytes in memory, so we’re going to have eight bytes for these two ints, and we’re going to have an unspecified amount of memory for the string, because a string is internally implemented as a class and it stores its data through a const char pointer. So what we really have is a pointer, and the size of an object doesn’t account for things you are referring to through pointers. It counts the size of the pointer itself, but it’s not going to count what is pointed to. In the case of a string, for example, we would really want to know the range of things we are pointing to, so we’re not going to get that information from the size of a std::string here.

But in general, the size of the object is going to be the sum of the sizes of its member variables. The functions are not going to be counted. They are just helper functions that live somewhere in memory, but they’re not accounted for in the size of your object. Another way to see why functions are not accounted for in the size of the object is that functions are really affiliated with the class itself, not with class objects. The class is the blueprint from which we can generate thousands of objects if we want, but the functions, or methods, are associated with the blueprint itself. So it wouldn’t make sense to attribute the size of those functions in memory to class objects.

Now that you
have this idea, let’s head over to Visual Studio Code and play with this a little more.

We have said that the size is mostly going to be the sum of the sizes of the member variables, but this is not always correct because of a phenomenon called boundary alignment. Boundary alignment is something most compilers do when a class has member variables of different types: member variables that take four bytes in memory are going to be stored at memory locations that are multiples of four. Because of this, we might have gaps in memory. If we have variables that take, for example, two bytes and four bytes in memory, we’re going to have gaps inside the object, and the size of the object is going to be slightly off from the sum of the sizes of its members. I realize this may be confusing, so we’re going to head over to Visual Studio Code and see a few examples.

Here we are in our working folder. The current project is Size of Class Objects. We’re going to grab our template files pretty quickly, put them in place, and open our project in Visual Studio Code by dragging and dropping. Here we have our main.cpp file. We can close the left sidebar and clean up the main function.

We’re going to set up a simple class. Let’s call it Dog. Why not? We have been using Dog, so let’s use it again. We’re going to put in our closing semicolon. We’re going to put in a public section that just contains a constructor: we’re going to say Dog and use the default. Then we’re going to put in a private section, and in there we’re going to put a size_t and say this is the leg count of a dog. Why not? Let’s say we are modeling this. So now we have our Dog class in place, and we can create objects of this class. We’re going to say Dog dog1. We are interested in knowing the size of our Dog class, so we are going to say std::cout, sizeof(Dog), and save that. And fingers
crossed, we’re going to see the size of our Dog object here. Try to guess what we’re going to get. In the slides, we saw that the size is mostly going to be the sum of the sizes of the member variables. Let’s also print the size of size_t, because size_t is our only member variable, so it’s going to be helpful to see its size printed out for context. We’re going to say sizeof(size_t), save that, and build this program. Let’s do that. We’re going to pass this through GCC. You see the build is good, and we’re going to bring up a terminal and run this to see what happens. We see the size of size_t is eight and the size of our Dog object is eight. This is really cool, because we see that the size of our class object is really the size of the member variables we have in here.

Let’s set up another member variable to really show you this. We’re going to have arm count. We are just messing with this to learn; I don’t aim for this class to make sense practically. We just want to see how the member variables affect the size of our class object. Now we have two member variables, and we’re going to try and build. We’re going to use GCC to do that, and we’re going to clear and run rooster. Now you see that the size of Dog is 16, and this shows that the size of an object only accounts for the sizes of the member variables. It’s not going to count the functions that we have here. So if, for example, we had a function called print_info and another one called do_something, this is not going to change the size of our object, because functions aren’t accounted for in the size of our object. Let’s build and really show you this. The build is good. We can clear and run rooster. We see that we have the same sizes: the size of Dog is 16 and the size of size_t is eight.

We can even use object names here. So far we have been using the class itself, but we can use dog1 here and
we’re going to get the same information. Let’s build, and we’re going to clear and run rooster, and you see that we have the same size.

If we change this class to put in a pointer, we’re going to see something else. Let’s put in an int pointer and call it p_age, like we have been doing all along, and see what we get. We know the size of a pointer on my system, but let’s print that out so that we have no doubt: we’re going to print sizeof(int*). Okay, let’s build. We’re going to pass this through GCC, and the build is good. We can clear and run rooster. We see that we have 24, because we have two member variables of type size_t that each take eight bytes in memory, and we have a pointer, which also has a size of eight. But we don’t care what this pointer is pointing to. If, for example, this pointer is pointing to something really large, we don’t care. We’re just going to account for the int pointer itself.

This matters for std::string as well, because by now you should know that std::string itself is implemented as a class in C++. So it’s really a class, and we can print its size. Let’s set up a string object here. We’re going to say std::string name and initialize it with Hello World. We’re going to see something really strange here. If we print the size of name, we’re going to say sizeof(name) and print that. Let’s see what we get. We’re going to build, see the build is good, and clear and run rooster. We see that the size of name is 32. Let’s change the text inside our string and see whether the size changes accordingly, because this can really be confusing. Let’s say “I am the king of the universe”. Why not? We’re going to build again. Let’s see what we get. Fingers crossed, this is going to blow your mind. You see, we have the same size for
name even if we are changing the text in here. So what is really happening? The string is storing our text through a const char pointer, so what we probably have inside is a member variable that is a pointer, like the one we have here. What is accounted for in the size of the string class is the size of the pointer, not the size of what is pointed to. The char pointer may be pointing to thousands of characters in memory, but as far as the compiler is concerned, it’s just going to use the pointer member variable and account for that in the size of your object. Hopefully you can understand why the size of our string object is going to be the same regardless of how large the string is. This is really something you need to understand; otherwise it’s going to throw you off as you design your C++ applications.

This is really all I had to share in this lecture. The size of your object is mostly going to be determined by the member variables you have, and the size is going to be the sum of the sizes of your member variables. We are going to stop here in this lecture. In the next one, we’re going to try and recap what we saw in this chapter. Go ahead and finish up here and meet me there.


🤖 Assistant

Hello, and welcome to C++! I'm glad to be your C++ tutor. We'll work through this step by step.

C++ Class Object Sizes: Understanding sizeof in Depth

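Summary:
We will look at how the amount of memory occupied by a C++ class object is determined. The core idea is that an object's size is set mainly by its member variables, while member functions normally do not count toward the size of an individual object. We will also discuss special cases such as std::string and an important concept called memory alignment.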


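Detailed explanation:

  1. The sizeof operator:
    In C++, sizeof is a unary operator that returns the number of bytes a type or a variable occupies in memory. It is essential for understanding how data is laid out in memory.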

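  2. The basic rule for class object size: the sum of the member variables

    • When you create an object of a class, the compiler allocates memory for that object's member variables.
    • As a first approximation, the size of an object is roughly the sum of the sizes of all its member variables.
    • For example, if a class has two member variables of type int and an int takes 4 bytes on your system, an object of that class is at least 4 + 4 = 8 bytes.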
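  3. Member functions do not count toward object size:

    • A class's member functions (also called methods) define what an object can do. The code of those functions, however, is not stored inside each object instance.
    • Think of the class as a blueprint and of objects as houses built from it. The blueprint describes what the house can do (the member functions), but those descriptions do not add to the land each house occupies.
    • A function's code lives in one place in memory (usually the code segment) and is shared by all objects of the class. When a member function is called on an object, the program jumps to that code and implicitly passes a pointer to the object (the this pointer), so the function knows whose data to operate on.
    • So no matter how many ordinary (non-virtual) member functions a class has, or how complex they are, they do not increase the size of the class's objects. The common exception is virtual functions: a class with at least one virtual function typically stores one hidden pointer to a virtual table in every object, which adds the size of a pointer regardless of how many virtual functions there are.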
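  4. Pointer members and the special case of std::string:

    • Pointer members: If a class has a pointer member variable such as int* p_age;, sizeof counts the size of the pointer variable itself (typically 8 bytes on a 64-bit system and 4 bytes on a 32-bit system), not the size of the data it points to. The pointed-to memory may be large, small, or not valid at all (a null pointer), and none of that affects the size of the pointer variable.
    • std::string: std::string is a widely used class in the C++ standard library for handling text. It is important to understand how sizeof(std::string) behaves:
      • Internally, std::string typically uses a pointer to manage the actual character data, which for longer strings is stored on the heap.
      • So sizeof(std::string) returns the size of the std::string object's own bookkeeping structure (which may include a pointer to the character data, the current length, the allocated capacity, and similar members), not the length of the text it holds.
      • This means that whether a std::string holds "Hello" or "Hello World, this is a very long string", sizeof(std::string) is the same on a given compiler and platform, because it covers only the fixed-size internal members that manage the characters.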
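  5. Memory alignment:

    • Although we said an object's size is roughly the sum of its member variables' sizes, the real figure can differ because of memory alignment.
    • To make memory access efficient for the CPU, the compiler places each variable at an address that is a multiple of that type's alignment requirement (for fundamental types this usually equals the type's size, such as 4 or 8).
    • Why align? Reading misaligned data can take the CPU several memory accesses, and on some architectures it raises a hardware exception outright. Aligned access is usually faster.
    • Padding bytes: To achieve alignment, the compiler may insert unused bytes, called padding bytes, between member variables or at the end of the object.
    • Effect: Because of padding, an object's actual size can be larger than the plain sum of its member variables' sizes.
    • For example, take a class with the following members:
      ```cpp
      class Example {
          char a;   // 1 byte
          int b;    // 4 bytes
          short c;  // 2 bytes
      };
      ```
      Simple addition gives 1 + 4 + 2 = 7 bytes. On a typical platform, though, the layout is likely:
      • char a; (1 byte)
      • The compiler inserts 3 padding bytes so that int b starts at an address divisible by 4.
      • int b; (4 bytes)
      • short c; (2 bytes)
      • The total size of the object must also be a multiple of its alignment (here 4, the strictest alignment among its members), so 2 more padding bytes are added at the end.
      • The size of an Example object is therefore likely 1 (a) + 3 (padding) + 4 (b) + 2 (c) + 2 (padding) = 12 bytes rather than 7. The exact result depends on the compiler and target platform.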

Code example:

Let's verify these ideas in code.

```cpp
#include <cstddef>   // std::size_t
#include <iostream>
#include <string>    // std::string

// A simple Dog class
class Dog {
public:
    // Constructor: use the compiler-generated default
    Dog() = default;

    // Dog owns a raw pointer, so copying is disabled to prevent a double delete
    Dog(const Dog&) = delete;
    Dog& operator=(const Dog&) = delete;

    // Member functions
    void print_info() const {
        std::cout << "Dog's leg count: " << leg_count_ << std::endl;
        std::cout << "Dog's arm count: " << arm_count_ << std::endl;
        if (p_age_ != nullptr) {
            std::cout << "Dog's age: " << *p_age_ << std::endl;
        }
        std::cout << "Dog's name: " << name_ << std::endl;
    }

    void set_age(int age) {
        if (p_age_ == nullptr) {
            p_age_ = new int;  // allocate on the heap
        }
        *p_age_ = age;
    }

    void set_name(const std::string& name) {
        name_ = name;
    }

    // Destructor: releases the dynamically allocated memory
    ~Dog() {
        delete p_age_;     // free the memory p_age_ points to
        p_age_ = nullptr;  // good habit: do not leave a dangling pointer
    }

private:
    // Member variables
    std::size_t leg_count_ {4};    // size_t is an unsigned integer type (8 bytes on typical 64-bit platforms)
    std::size_t arm_count_ {0};    // initial value
    int* p_age_ {nullptr};         // pointer to the age, initially null
    std::string name_ {"Buddy"};   // a std::string object
};

// A class used to demonstrate memory alignment
class AlignmentDemo {
public:
    char c1;
    // the compiler will typically insert padding here so that i is aligned
    int i;
    short s;
    // the compiler may insert padding here, at the end of the object
};

class EmptyClass {};  // an empty class

int main() {
    std::cout << "--- Sizes of fundamental types ---" << std::endl;
    std::cout << "sizeof(char): " << sizeof(char) << " byte(s)" << std::endl;
    std::cout << "sizeof(short): " << sizeof(short) << " byte(s)" << std::endl;
    std::cout << "sizeof(int): " << sizeof(int) << " byte(s)" << std::endl;
    std::cout << "sizeof(long): " << sizeof(long) << " byte(s)" << std::endl;
    std::cout << "sizeof(long long): " << sizeof(long long) << " byte(s)" << std::endl;
    std::cout << "sizeof(float): " << sizeof(float) << " byte(s)" << std::endl;
    std::cout << "sizeof(double): " << sizeof(double) << " byte(s)" << std::endl;
    std::cout << "sizeof(size_t): " << sizeof(std::size_t) << " byte(s)" << std::endl;
    std::cout << "sizeof(int*): " << sizeof(int*) << " byte(s)" << std::endl;
    std::cout << "sizeof(std::string): " << sizeof(std::string) << " byte(s)" << std::endl;

    std::cout << "\n--- Size of Dog objects ---" << std::endl;
    Dog dog1;
    dog1.set_age(5);
    dog1.set_name("Rex");

    Dog dog2;
    dog2.set_age(10);
    // dog2 keeps the default name "Buddy"

    // Print the size of the Dog class (via the type)
    std::cout << "sizeof(Dog) class: " << sizeof(Dog) << " byte(s)" << std::endl;
    // Print the size of a Dog object (via an instance)
    std::cout << "sizeof(dog1) object: " << sizeof(dog1) << " byte(s)" << std::endl;

    // Explanation:
    // Dog contains:
    // 1. size_t leg_count_  (assume 8 bytes)
    // 2. size_t arm_count_  (assume 8 bytes)
    // 3. int* p_age_        (assume 8 bytes)
    // 4. std::string name_  (assume 32 bytes; the value depends on the standard library)
    // Plain sum, ignoring alignment: 8 + 8 + 8 + 32 = 56 bytes.
    // The actual size is this sum rounded up, if necessary, to a multiple of the class's alignment.

    std::cout << "\n--- std::string behavior ---" << std::endl;
    std::string s1 = "short";
    std::string s2 = "this is a much longer string";
    std::cout << "sizeof(s1) (\"" << s1 << "\"): " << sizeof(s1) << " byte(s), length: " << s1.length() << std::endl;
    std::cout << "sizeof(s2) (\"" << s2 << "\"): " << sizeof(s2) << " byte(s), length: " << s2.length() << std::endl;
    // Note: sizeof(s1) and sizeof(s2) are equal even though the strings differ in length.

    std::cout << "\n--- Memory alignment demo ---" << std::endl;
    std::cout << "sizeof(AlignmentDemo): " << sizeof(AlignmentDemo) << " byte(s)" << std::endl;
    // AlignmentDemo contains char (1), int (4), short (2): 1 + 4 + 2 = 7 bytes in total.
    // The actual size is likely 12 bytes:
    //   char c1;   // 1 byte
    //   padding;   // 3 bytes (so that int i starts on a 4-byte boundary)
    //   int i;     // 4 bytes
    //   short s;   // 2 bytes
    //   padding;   // 2 bytes (the size must be a multiple of the alignment, here 4: 1+3+4+2 = 10, rounded up to 12)
    // The exact padding depends on the compiler and platform.

    std::cout << "\n--- Size of an empty class ---" << std::endl;
    std::cout << "sizeof(EmptyClass): " << sizeof(EmptyClass) << " byte(s)" << std::endl;
    // The C++ standard requires that the size of an empty class is not 0. It is usually 1 byte.
    // This guarantees that distinct objects of an empty class have distinct addresses.

    return 0;
}
```

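Expected output (results may differ between systems and compilers, especially sizeof(std::string) and the aligned class sizes):

  • The fundamental-type lines show the sizes of those types on your system.
  • sizeof(Dog) is the sum of its members, size_t (usually 8 bytes), size_t (8 bytes), int* (usually 8 bytes), and std::string (32 bytes in GCC's libstdc++, for example), plus any alignment padding. So it is likely 8 + 8 + 8 + 32 = 56, or slightly larger because of alignment.
  • sizeof(s1) and sizeof(s2) are the same, even though the strings differ in length.
  • sizeof(AlignmentDemo) is most likely not 1 + 4 + 2 = 7 bytes but, because of alignment, something like 12 bytes.
  • sizeof(EmptyClass) is usually 1 byte.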

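Q&A flash cards:

  1. Q: Which C++ operator reports how many bytes a variable or type occupies in memory?
    A: The sizeof operator.
  2. Q: What mainly determines the size of a class object?
    A: Mainly the sum of the sizes of its member variables (possibly increased by memory alignment).
  3. Q: Do a class's member functions increase the size of each object of that class? Why?
    A: No. The code of member functions is stored in one shared place in memory and is not copied into each object.
  4. Q: If a class has a member int* ptr;, how much does ptr contribute to sizeof for an object of that class?
    A: The size of the pointer ptr itself (for example, 8 bytes on a 64-bit system), not the size of the int data it points to.
  5. Q: Does sizeof(std::string) return the number of characters in the string?
    A: No. It returns the size of the std::string object's internal bookkeeping structure, which is fixed for a given compiler and independent of the length of the text.
  6. Q: What is memory alignment, and how does it affect object size?
    A: To make CPU access efficient, the compiler places member variables at particular addresses (usually multiples of their size). This can add padding bytes between members or at the end of the object, making the actual size larger than the plain sum of the member sizes.
  7. Q: What is the size of an empty class, one with no member variables?
    A: Usually 1 byte. The C++ standard does not allow an object of size 0, so that distinct objects have distinct addresses.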

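Common misconceptions and mistakes:

  1. Confusing sizeof(std::string) with myString.length():
    • Wrong: Assuming sizeof(std::string) tells you how many characters the string holds.
    • Right: sizeof(std::string) is the size of the std::string management class itself. To get the number of characters, use the .length() or .size() member function.
  2. Ignoring memory alignment:
    • Wrong: Adding up the sizes of all member variables and treating the sum as the object's final size.
    • Right: The final size can be larger because of memory alignment.
  3. Assuming member functions increase object size:
    • Wrong: Worrying that defining many member functions will make objects huge.
    • Right: Member functions do not affect the size of individual objects.
  4. Misunderstanding sizeof for pointer members:
    • Wrong: Given class MyClass { char* name; };, assuming sizeof(MyClass) includes the length of the string that name points to.
    • Right: sizeof(MyClass) includes only the size of the pointer name itself.
  5. Expecting the size to change at run time:
    • Wrong: Assuming the value of sizeof(MyObject) changes as the object allocates memory dynamically (for example, when elements are added to a std::vector).
    • Right: sizeof is a compile-time constant. It gives the size of the object's own fixed structure and excludes any heap memory the object manages.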

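Coding exercise:

Below is a Gadget class. Think about it, then fill in the code to print the size of a Gadget object. Compile and run it to see whether your prediction was right.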

```cpp
#include <cstddef>   // std::size_t
#include <iostream>

class Gadget {
public:
    char id_char;     // 1 byte
    // the compiler may insert padding here
    int item_code;    // 4 bytes
    // the compiler may insert padding here
    short version;    // 2 bytes
    // the compiler may insert padding here
    bool is_enabled;  // 1 byte (bool may be larger on some implementations, but it is usually 1)
    // the compiler may insert padding at the end so that the object's size is a multiple of its alignment (for example 4 or 8)

    Gadget(char id, int code, short v, bool enabled)
        : id_char(id), item_code(code), version(v), is_enabled(enabled) {}

    void display_details() const {
        std::cout << "ID Char: " << id_char
                  << ", Item Code: " << item_code
                  << ", Version: " << version
                  << ", Enabled: " << (is_enabled ? "true" : "false")
                  << std::endl;
    }
};

int main() {
    Gadget my_gadget('A', 101, 2, true);
    my_gadget.display_details();

    // Predict: what will the size of my_gadget be?
    // Sum of member sizes: sizeof(char) + sizeof(int) + sizeof(short) + sizeof(bool)
    //                    = 1 + 4 + 2 + 1 = 8 bytes.
    // But what might the actual size be once memory alignment is applied?

    std::size_t size_of_gadget = 0;  // placeholder until the TODO below is filled in

    // TODO: Write code here that computes the size of the my_gadget object and stores it in size_of_gadget.
    // Then print that size.
    // ----------- fill in below -----------
    // size_of_gadget = ... ;
    // std::cout << "Size of my_gadget: " << ... << " bytes" << std::endl;
    // -------------------------------------

    return 0;
}
```

Try filling in the TODO section, and I will grade your answer and explain the result.

